NEISSERIA GENOMIC SEQUENCES AND METHODS OF THEIR USE

This application claims priority to provisional U.S. applications serial nos. 
60/103,794, filed 9 October, 1998 and 60/132,068, filed 30 April, 1999, both of which are 
incorporated in full herein by reference.

This invention relates to methods of obtaining antigens and immunogens, the antigens
and immunogens so obtained, and nucleic acids from the bacterial species Neisseria
meningitidis. In particular, it relates to genomic sequences from the bacterium, more
particularly its "B" serogroup.

BACKGROUND 

Neisseria meningitidis is a non-motile, Gram-negative diplococcus and human pathogen.
It colonizes the pharynx, causing meningitis and, occasionally, septicaemia in the absence of
meningitis. It is closely related to N. gonorrhoeae, although one feature that clearly
differentiates meningococcus from gonococcus is the polysaccharide capsule
that is present in all pathogenic meningococci.

N. meningitidis causes both endemic and epidemic disease. In the United States the
attack rate is 0.6-1 per 100,000 persons per year, and it can be much greater during outbreaks
(see Lieberman et al. (1996) Safety and Immunogenicity of a Serogroups A/C Neisseria
meningitidis Oligosaccharide-Protein Conjugate Vaccine in Young Children. JAMA
275(19):1499-1503; Schuchat et al. (1997) Bacterial Meningitis in the United States in 1995.
N Engl J Med 337(14):970-976). In developing countries, endemic disease rates are much
higher and during epidemics incidence rates can reach 500 cases per 100,000 persons per
year. Mortality is extremely high, at 10-20% in the United States, and much higher in
developing countries. Following the introduction of the conjugate vaccine against
Haemophilus influenzae, N. meningitidis is the major cause of bacterial meningitis at all ages
in the United States (Schuchat et al. (1997) supra).

Based on the organism's capsular polysaccharide, 12 serogroups of N. meningitidis
have been identified. Group A is the pathogen most often implicated in epidemic disease in
sub-Saharan Africa. Serogroups B and C are responsible for the vast majority of cases in the
United States and in most developed countries. Serogroups W135 and Y are responsible for
the rest of the cases in the United States and developed countries. The meningococcal
vaccine currently in use is a tetravalent polysaccharide vaccine composed of serogroups A, C,
Y and W135. Although efficacious in adolescents and adults, it induces a poor immune
response and short duration of protection, and cannot be used in infants (e.g., Morbidity and
Mortality Weekly Report, Vol. 46, No. RR-5 (1997)). This is because polysaccharides are
T-cell independent antigens that induce a weak immune response that cannot be boosted by
repeated immunization. Following the success of the vaccination against H. influenzae,
conjugate vaccines against serogroups A and C have been developed and are at the final stage
of clinical testing (Zollinger WD "New and Improved Vaccines Against Meningococcal
Disease". In: New Generation Vaccines, supra, pp. 469-488; Lieberman et al. (1996) supra;
Costantino et al. (1992) Development and phase I clinical testing of a conjugate vaccine
against meningococcus A (menA) and C (menC). Vaccine 10:691-698).

Meningococcus B (MenB) remains a problem, however. This serotype currently is
responsible for approximately 50% of total meningitis in the United States, Europe, and
South America. The polysaccharide approach cannot be used because the MenB capsular
polysaccharide is a polymer of α(2-8)-linked N-acetyl neuraminic acid that is also present in
mammalian tissue. This results in tolerance to the antigen; indeed, if an immune response
were elicited, it would be anti-self, and therefore undesirable. In order to avoid induction of
autoimmunity and to induce a protective immune response, the capsular polysaccharide has,
for instance, been chemically modified by substituting the N-acetyl groups with N-propionyl
groups, leaving the specific antigenicity unaltered (Romero & Outschoorn (1994) Current
status of Meningococcal group B vaccine candidates: capsular or non-capsular? Clin
Microbiol Rev 7(4):559-575).

Alternative approaches to MenB vaccines have used complex mixtures of outer
membrane proteins (OMPs), containing either the OMPs alone, or OMPs enriched in porins,
or deleted of the class 4 OMPs that are believed to induce antibodies that block bactericidal
activity. This approach produces vaccines that are not well characterized. They are able to
protect against the homologous strain, but are not effective at large where there are many
antigenic variants of the outer membrane proteins. To overcome the antigenic variability,
multivalent vaccines containing up to nine different porins have been constructed (e.g.,
Poolman JT (1992) Development of a meningococcal vaccine. Infect. Agents Dis. 4:13-28).
Additional proteins to be used in outer membrane vaccines have been the opa and opc
proteins, but none of these approaches have been able to overcome the antigenic variability
(e.g., Ala'Aldeen & Borriello (1996) The meningococcal transferrin-binding proteins 1 and 2
are both surface exposed and generate bactericidal antibodies capable of killing homologous
and heterologous strains. Vaccine 14(1):49-53).

A certain amount of sequence data is available for meningococcal and gonococcal
genes and proteins (e.g., EP-A-0467714, WO96/29412), but this is by no means complete.
The provision of further sequences could provide an opportunity to identify secreted or
surface-exposed proteins that are presumed targets for the immune system and which are not
antigenically variable or at least are more antigenically conserved than other and more
variable regions. Thus, those antigenic sequences that are more highly conserved are
preferred sequences. Those sequences specific to Neisseria meningitidis or Neisseria
gonorrhoeae that are more highly conserved are further preferred sequences. For instance,
some of the identified proteins could be components of efficacious vaccines against
meningococcus B, some could be components of vaccines against all meningococcal
serotypes, and others could be components of vaccines against all pathogenic Neisseriae.
The identification of sequences from the bacterium will also facilitate the production of
biological probes, particularly organism-specific probes.

It is thus an object of the invention to provide Neisserial DNA sequences which
(1) encode proteins predicted and/or shown to be antigenic or immunogenic, (2) can be used
as probes or amplification primers, and (3) can be analyzed by bioinformatics.



BRIEF DESCRIPTION OF THE DRAWINGS 
Fig. 1 illustrates the products of protein expression and purification of the predicted
ORF 919 as cloned and expressed in E. coli.

Fig. 2 illustrates the products of protein expression and purification of the predicted
ORF 279 as cloned and expressed in E. coli.

Fig. 3 illustrates the products of protein expression and purification of the predicted
ORF 576-1 as cloned and expressed in E. coli.

Fig. 4 illustrates the products of protein expression and purification of the predicted
ORF 519-1 as cloned and expressed in E. coli.

Fig. 5 illustrates the products of protein expression and purification of the predicted
ORF 121-1 as cloned and expressed in E. coli.

Fig. 6 illustrates the products of protein expression and purification of the predicted
ORF 128-1 as cloned and expressed in E. coli.

Fig. 7 illustrates the products of protein expression and purification of the predicted
ORF 206 as cloned and expressed in E. coli.

Fig. 8 illustrates the products of protein expression and purification of the predicted
ORF 287 as cloned and expressed in E. coli.

Fig. 9 illustrates the products of protein expression and purification of the predicted
ORF 406 as cloned and expressed in E. coli.

Fig. 10 illustrates the hydrophilicity plot, antigenic index and AMPHI regions of the
products of protein expression of the predicted ORF 919 as cloned and expressed in E. coli.

Fig. 11 illustrates the hydrophilicity plot, antigenic index and AMPHI regions of the
products of protein expression of the predicted ORF 279 as cloned and expressed in E. coli.

Fig. 12 illustrates the hydrophilicity plot, antigenic index and AMPHI regions of the
products of protein expression of the predicted ORF 576-1 as cloned and expressed in E. coli.

Fig. 13 illustrates the hydrophilicity plot, antigenic index and AMPHI regions of the
products of protein expression of the predicted ORF 519-1 as cloned and expressed in E. coli.

Fig. 14 illustrates the hydrophilicity plot, antigenic index and AMPHI regions of the
products of protein expression of the predicted ORF 121-1 as cloned and expressed in E. coli.

Fig. 15 illustrates the hydrophilicity plot, antigenic index and AMPHI regions of the
products of protein expression of the predicted ORF 128-1 as cloned and expressed in E. coli.

Fig. 16 illustrates the hydrophilicity plot, antigenic index and AMPHI regions of the
products of protein expression of the predicted ORF 206 as cloned and expressed in E. coli.

Fig. 17 illustrates the hydrophilicity plot, antigenic index and AMPHI regions of the
products of protein expression of the predicted ORF 287 as cloned and expressed in E. coli.

Fig. 18 illustrates the hydrophilicity plot, antigenic index and AMPHI regions of the
products of protein expression of the predicted ORF 406 as cloned and expressed in E. coli.

THE INVENTION

The invention is based on the 961 nucleotide sequences from the genome of
N. meningitidis shown as SEQ ID NOs: 1-961 of Appendix C, and the full length genome of
N. meningitidis shown as SEQ ID NO. 1068 in Appendix D. The 961 sequences in Appendix
C represent substantially the whole genome of serotype B of N. meningitidis (>99.98%).
There is partial overlap between some of the 961 contiguous sequences ("contigs") shown in
the sequences in Appendix C, which overlap was used to construct the single full length
sequence shown in SEQ ID NO. 1068 in Appendix D, using the TIGR Assembler [G.S.
Sutton et al., TIGR Assembler: A New Tool for Assembling Large Shotgun Sequencing
Projects, Genome Science and Technology, 1:9-19 (1995)]. Some of the nucleotides in the
contigs had been previously released (see ftp://ftp.tigr.org/pub/data/n_meningitidis on the
world-wide web or "WWW"). The coordinates of the 2508 released sequences in the present
contigs are presented in Appendix A. These data include the contig number (or i.d.) as
presented in the first column; the name of the sequence as found on WWW is in the second
column; with the coordinates of the contigs in the third and fourth columns, respectively.
The sequences of certain MenB ORFs presented in Appendix B feature in International
Patent Applications filed by Chiron SpA on October 9, 1998 (PCT/IB98/01665) and January
14, 1999 (PCT/IB99/00103) respectively.

In a first aspect, the invention provides nucleic acid including one or more of the
N. meningitidis nucleotide sequences shown in SEQ ID NOs: 1-961 and 1068 in Appendices
C and E. It also provides nucleic acid comprising sequences having sequence identity to the
nucleotide sequence disclosed herein. Depending on the particular sequence, the degree of
sequence identity is preferably greater than 50% (e.g., 60%, 70%, 80%, 90%, 95%, 99% or
more). These sequences include, for instance, mutants and allelic variants. The degree of
sequence identity cited herein is determined across the length of the sequence by
the Smith-Waterman homology search algorithm as implemented in the MPSRCH program
(Oxford Molecular) using an affine gap search with the following parameters: gap open
penalty 12, gap extension penalty 1.
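
As an illustration of the kind of calculation referred to above, the following is a minimal Python sketch of a Smith-Waterman local alignment with affine gaps (gap open penalty 12, gap extension penalty 1) that reports percent identity over the aligned columns. It is not the MPSRCH implementation. The function name local_identity, the match and mismatch scores (5 and -4), and the convention that a gap of k residues costs gap_open + (k - 1) * gap_extend are assumptions made for the example only.

```python
NEG_INF = float("-inf")

def _step(cell, delta, identical=False):
    """Extend an alignment by one column.

    A cell is (score, identical columns, total columns)."""
    score, same, cols = cell
    return (score + delta, same + int(identical), cols + 1)

def local_identity(a, b, match=5, mismatch=-4, gap_open=12, gap_extend=1):
    """Return (score, percent identity) of the best local alignment of a and b.

    Assumptions: a gap of k residues costs gap_open + (k - 1) * gap_extend;
    the match/mismatch scores are illustrative and not taken from the text.
    """
    start = (0, 0, 0)                  # empty alignment
    closed = (NEG_INF, 0, 0)           # unreachable state
    h_prev = [start] * (len(b) + 1)    # best alignments ending in the previous row
    f_prev = [closed] * (len(b) + 1)   # previous row, ending with a gap in b
    best = start
    for i in range(1, len(a) + 1):
        h_cur = [start] * (len(b) + 1)
        f_cur = [closed] * (len(b) + 1)
        e = closed                     # current row, ending with a gap in a
        for j in range(1, len(b) + 1):
            # Open a new gap from the best alignment, or extend an existing gap.
            e = max(_step(h_cur[j - 1], -gap_open), _step(e, -gap_extend))
            f_cur[j] = max(_step(h_prev[j], -gap_open),
                           _step(f_prev[j], -gap_extend))
            same = a[i - 1] == b[j - 1]
            diag = _step(h_prev[j - 1], match if same else mismatch, same)
            # Local alignment: never fall below the empty alignment.
            h_cur[j] = max(start, diag, e, f_cur[j])
            best = max(best, h_cur[j])
        h_prev, f_prev = h_cur, f_cur
    score, same, cols = best
    return score, (100.0 * same / cols if cols else 0.0)

if __name__ == "__main__":
    print(local_identity("ACGTACGTAC", "ACGTTCGTAC"))
```

Identity is counted here as identical columns divided by all columns of the best local alignment, gap columns included; other conventions (for example, dividing by the length of the shorter sequence) would give different percentages.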

The invention also provides nucleic acid including a fragment of one or more of the
nucleotide sequences set out herein. The fragment should comprise at least n consecutive
nucleotides from the sequences and, depending on the particular sequence, n is 10 or more
(e.g., 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 35, 40, 45, 50, 60, 75, 100
or more). Preferably, the fragment is unique to the genome of N. meningitidis, that is to say it
is not present in the genome of another organism. More preferably, the fragment is unique to
the genome of strain B of N. meningitidis. The invention also provides nucleic acid that
hybridizes to those provided herein. Conditions for hybridizing are disclosed herein.
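
By way of illustration only, the notion of a fragment of n consecutive nucleotides that is absent from another genome can be sketched in Python as below. The function names and the toy sequences are hypothetical, and a real comparison would be run against complete genome sequences rather than short strings.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def fragments(seq, n):
    """Yield (start, fragment) for every run of n consecutive nucleotides."""
    for start in range(len(seq) - n + 1):
        yield start, seq[start:start + n]

def unique_fragments(target, background, n=10):
    """Fragments of target (length n) absent from both strands of background."""
    seen = {frag for _, frag in fragments(background, n)}
    seen |= {frag for _, frag in fragments(reverse_complement(background), n)}
    return [(start, frag) for start, frag in fragments(target, n)
            if frag not in seen]

if __name__ == "__main__":
    # Toy example with n = 4; the text contemplates n of 10 or more.
    print(unique_fragments("AAAACCCC", "AAAAC", n=4))
```

Checking both strands of the background sequence reflects the fact that a probe or primer can hybridize to either strand.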

The invention also provides nucleic acid including sequences complementary to those 
described above (e.g., for antisense, for probes, or for amplification primers). 

Nucleic acid according to the invention can, of course, be prepared in many ways
(e.g., by chemical synthesis, from DNA libraries, from the organism itself, etc.) and can take
various forms (e.g., single-stranded, double-stranded, vectors, probes, primers, etc.). The
term "nucleic acid" includes DNA and RNA, and also their analogs, such as those containing
modified backbones, and also peptide nucleic acid (PNA) etc.

It will be appreciated that, as SEQ ID NOs: 1-961 represent the substantially complete
genome of the organism, with partial overlap, references to SEQ ID NOs: 1-961 include
within their scope references to the complete genomic sequence, e.g., where two SEQ ID
NOs overlap, the invention encompasses the single sequence which is formed by assembling
the two overlapping sequences. Thus, for instance, a nucleotide sequence which bridges two
SEQ ID NOs but is not present in its entirety in either SEQ ID NO is still within the scope of
the invention. Additionally, such a sequence will be present in its entirety in the single full
length sequence of SEQ ID NO. 1068.
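
The bridging case described above can be pictured with a toy Python sketch that joins two contigs on their longest exact suffix-prefix overlap. This is a deliberately simplified stand-in and not the TIGR Assembler, which also handles sequencing errors and many reads at once; the function names, the min_overlap parameter, and the example sequences are hypothetical.

```python
def overlap_length(left, right, min_overlap=3):
    """Length of the longest suffix of left that equals a prefix of right."""
    for size in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0

def merge_contigs(left, right, min_overlap=3):
    """Join two overlapping contigs into the single sequence they imply."""
    size = overlap_length(left, right, min_overlap)
    if size == 0:
        raise ValueError("contigs do not overlap")
    return left + right[size:]

if __name__ == "__main__":
    left, right = "ATGGCGTACG", "TACGGATTC"
    merged = merge_contigs(left, right)
    bridge = "CGTACGGA"  # spans the junction between the two contigs
    print(merged, bridge in left, bridge in right, bridge in merged)
```

In this example the bridging sequence is found in neither contig alone but is present in the merged sequence, which is the situation the paragraph above addresses.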

The invention also provides vectors including nucleotide sequences of the invention 
(e.g., expression vectors, sequencing vectors, cloning vectors, etc.) and host cells transformed
with such vectors. 

According to a further aspect, the invention provides a protein including an amino
acid sequence encoded within a N. meningitidis nucleotide sequence set out herein. It also
provides proteins comprising sequences having sequence identity to those proteins.
Depending on the particular sequence, the degree of sequence identity is preferably greater
than 50% (e.g., 60%, 70%, 80%, 90%, 95%, 99% or more). Sequence identity is determined
as above disclosed. These homologous proteins include mutants and allelic variants, encoded
within the N. meningitidis nucleotide sequence set out herein.

The invention further provides proteins including fragments of an amino acid
sequence encoded within a N. meningitidis nucleotide sequence set out in the sequence
listing. The fragments should comprise at least n consecutive amino acids from the
sequences and, depending on the particular sequence, n is 7 or more (e.g., 8, 10, 12, 14, 16,
18, 20 or more). Preferably the fragments comprise an epitope from the sequence.

The proteins of the invention can, of course, be prepared by various means (e.g.,
recombinant expression, purification from cell culture, chemical synthesis, etc.) and in
various forms (e.g., native, fusions, etc.). They are preferably prepared in substantially
isolated form (i.e., substantially free from other N. meningitidis host cell proteins).

Various tests can be used to assess the in vivo immunogenicity of the proteins of the
invention. For example, the proteins can be expressed recombinantly or chemically
synthesized and used to screen patient sera by immunoblot. A positive reaction between the
protein and patient serum indicates that the patient has previously mounted an immune
response to the protein in question; i.e., the protein is an immunogen. This method can also
be used to identify immunodominant proteins.

The invention also provides nucleic acid encoding a protein of the invention. 

In a further aspect, the invention provides a computer, a computer memory, a
computer storage medium (e.g., floppy disk, fixed disk, CD-ROM, etc.), and/or a computer
database containing the nucleotide sequence of nucleic acid according to the invention.
Preferably, it contains one or more of the N. meningitidis nucleotide sequences set out herein.

This may be used in the analysis of the N. meningitidis nucleotide sequences set out 
herein. For instance, it may be used in a search to identify open reading frames (ORFs) or 
coding sequences within the sequences. 

In a further aspect, the invention provides a method for identifying an amino acid
sequence, comprising the step of searching for putative open reading frames or protein-coding
sequences within a N. meningitidis nucleotide sequence set out herein. Similarly, the
invention provides the use of a N. meningitidis nucleotide sequence set out herein in a search
for putative open reading frames or protein-coding sequences.

Open-reading frame or protein-coding sequence analysis is generally performed on a
computer using standard bioinformatic techniques. Typical algorithms or programs used in
the analysis include ORFFINDER (NCBI), GENMARK [Borodovsky & McIninch (1993)
Computers Chem 17:122-133], and GLIMMER [Salzberg et al. (1998) Nucl Acids Res
26:544-548].

A search for an open reading frame or protein-coding sequence may comprise the
steps of searching a N. meningitidis nucleotide sequence set out herein for an initiation codon
and searching the downstream sequence for an in-frame termination codon. The intervening
codons represent a putative protein-coding sequence. Typically, all six possible reading
frames of a sequence will be searched.
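
A six-frame search of this kind can be sketched in a few lines of Python. The sketch below is only an illustration and is not ORFFINDER, GENMARK or GLIMMER: it reads from an initiation codon to the next in-frame termination codon on each strand. The function name find_orfs, the restriction to ATG as the initiation codon, and the min_codons cut-off are assumptions for the example (bacterial genes may also start at GTG or TTG, which can be passed via the starts parameter).

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOP_CODONS = {"TAA", "TAG", "TGA"}

def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def find_orfs(seq, starts=("ATG",), min_codons=3):
    """Return putative ORFs as (strand, frame, start, end, dna) tuples.

    Coordinates are 0-based, end-exclusive, and relative to the strand that
    was scanned (for "-" they refer to the reverse complement). The returned
    dna includes the termination codon.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            open_at = None  # position of the current initiation codon, if any
            for pos in range(frame, len(s) - 2, 3):
                codon = s[pos:pos + 3]
                if open_at is None and codon in starts:
                    open_at = pos
                elif open_at is not None and codon in STOP_CODONS:
                    # Count coding codons (the stop codon is excluded).
                    if (pos - open_at) // 3 >= min_codons:
                        orfs.append((strand, frame, open_at, pos + 3,
                                     s[open_at:pos + 3]))
                    open_at = None
    return orfs

if __name__ == "__main__":
    for orf in find_orfs("CCATGAAATTTTAGCC"):
        print(orf)
```

Initiation codons that are never followed by an in-frame termination codon are ignored here; a fuller tool would typically report such partial ORFs at contig ends as well.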

An amino acid sequence identified in this way can be expressed using any suitable
system to give a protein. This protein can be used to raise antibodies which recognize
epitopes within the identified amino acid sequence. These antibodies can be used to screen
N. meningitidis to detect the presence of a protein comprising the identified amino acid
sequence.

Furthermore, once an ORF or protein-coding sequence is identified, the sequence can
be compared with sequence databases. Sequence analysis tools can be found at NCBI
(http://www.ncbi.nlm.nih.gov) e.g., the algorithms BLAST, BLAST2, BLASTn, BLASTp,
tBLASTn, BLASTx, & tBLASTx [see also Altschul et al. (1997) Gapped BLAST and
PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Research
25:3389-3402]. Suitable databases for comparison include the nonredundant GenBank,
EMBL, DDBJ and PDB sequences, and the nonredundant GenBank CDS translations, PDB,
SwissProt, Spupdate and PIR sequences. This comparison may give an indication of the
function of a protein.

Hydrophobic domains in an amino acid sequence can be predicted using algorithms
such as those based on the statistical studies of Esposti et al. [Critical evaluation of the
hydropathy of membrane proteins (1990) Eur J Biochem 190:207-219]. Hydrophobic
domains represent potential transmembrane regions or hydrophobic leader sequences, which
suggest that the proteins may be secreted or be surface-located. These properties are
typically representative of good immunogens.
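
As a rough illustration of how hydrophobic domains may be located, the Python sketch below computes a sliding-window hydropathy profile. It does not reproduce the scales of Esposti et al.; the widely used Kyte-Doolittle scale is substituted for the example, and the 19-residue window and 1.6 threshold are conventional choices for that scale rather than values taken from the text. The function names are hypothetical.

```python
# Kyte-Doolittle hydropathy values (substituted here for illustration).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def hydropathy_profile(protein, window=19):
    """Mean hydropathy of every window of consecutive residues."""
    scores = [KYTE_DOOLITTLE[aa] for aa in protein.upper()]
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]

def hydrophobic_segments(protein, window=19, threshold=1.6):
    """Spans (0-based, end-exclusive) covered by windows at or above threshold."""
    segments = []
    for start, mean in enumerate(hydropathy_profile(protein, window)):
        if mean < threshold:
            continue
        end = start + window
        if segments and start <= segments[-1][1]:
            # Merge with the previous span when the windows touch or overlap.
            segments[-1] = (segments[-1][0], end)
        else:
            segments.append((start, end))
    return segments

if __name__ == "__main__":
    toy = "K" * 20 + "L" * 20 + "D" * 20  # charged flanks around a leucine run
    print(hydrophobic_segments(toy))
```

Spans reported this way are only candidates for transmembrane regions or leader sequences; dedicated predictors such as PSORT apply further criteria.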

Similarly, transmembrane domains or leader sequences can be predicted using the
PSORT algorithm (http://www.psort.nibb.ac.jp), and functional domains can be predicted
using the MOTIFS program (GCG Wisconsin & PROSITE).

The invention also provides nucleic acid including an open reading frame or protein-coding
sequence present in a N. meningitidis nucleotide sequence set out herein.
Furthermore, the invention provides a protein including the amino acid sequence encoded by
this open reading frame or protein-coding sequence.

According to a further aspect, the invention provides antibodies which bind to these
proteins. These may be polyclonal or monoclonal and may be produced by any suitable
means known to those skilled in the art.

The antibodies of the invention can be used in a variety of ways, e.g., for confirmation
that a protein is expressed, or to confirm where a protein is expressed. Labeled antibody
(e.g., fluorescent labeling for FACS) can be incubated with intact bacteria and the presence of
label on the bacterial surface confirms the location of the protein, for instance.

According to a further aspect, the invention provides compositions including protein, 
antibody, and/or nucleic acid according to the invention. These compositions may be suitable 
as vaccines, as immunogenic compositions, or as diagnostic reagents. 

The invention also provides nucleic acid, protein, or antibody according to the
invention for use as medicaments (e.g., as vaccines) or as diagnostic reagents. It also
provides the use of nucleic acid, protein, or antibody according to the invention in the
manufacture of (i) a medicament for treating or preventing infection due to Neisserial
bacteria; (ii) a diagnostic reagent for detecting the presence of Neisserial bacteria or of
antibodies raised against Neisserial bacteria. Said Neisserial bacteria may be any species or
strain (such as N. gonorrhoeae) but are preferably N. meningitidis, especially strain A, strain
B or strain C.

In still yet another aspect, the present invention provides for compositions including
proteins, nucleic acid molecules, or antibodies. More preferable aspects of the present
invention are drawn to immunogenic compositions of proteins. Further preferable aspects of
the present invention contemplate pharmaceutical immunogenic compositions of proteins or
vaccines and the use thereof in the manufacture of a medicament for the treatment or
prevention of infection due to Neisserial bacteria, preferably infection of MenB.

The invention also provides a method of treating a patient, comprising administering
to the patient a therapeutically effective amount of nucleic acid, protein, and/or antibody
according to the invention.

According to further aspects, the invention provides various processes.

A process for producing proteins of the invention is provided, comprising the step of
culturing a host cell according to the invention under conditions which induce protein
expression. The process may further include chemical synthesis of proteins and/or
chemical synthesis (at least in part) of nucleotides.

A process for detecting polynucleotides of the invention is provided, comprising the
steps of: (a) contacting a nucleic acid probe according to the invention with a biological sample
under hybridizing conditions to form duplexes; and (b) detecting said duplexes.

A process for detecting proteins of the invention is provided, comprising the steps of:
(a) contacting an antibody according to the invention with a biological sample under
conditions suitable for the formation of antibody-antigen complexes; and (b) detecting
said complexes.

Another aspect of the present invention provides for a process for detecting antibodies
that selectively bind to antigens or polypeptides or proteins specific to any species or strain of
Neisserial bacteria and preferably to strains of N. gonorrhoeae but more preferably to strains
of N. meningitidis, especially strain A, strain B or strain C, more preferably MenB, where the
process comprises the steps of: (a) contacting antigen or polypeptide or protein according to
the invention with a biological sample under conditions suitable for the formation of
antibody-antigen complexes; and (b) detecting said complexes.

Having now generally described the invention, the same will be more readily
understood through reference to the following examples which are provided by way of
illustration, and are not intended to be limiting of the present invention, unless specified.

Methodology - Summary of standard procedures and techniques.

General

This invention provides Neisseria meningitidis MenB nucleotide sequences and amino
acid sequences encoded therein. With these disclosed sequences, nucleic acid probe assays
and expression cassettes and vectors can be produced. The proteins can also be chemically
synthesized. The expression vectors can be transformed into host cells to produce proteins.
The purified or isolated polypeptides can be used to produce antibodies to detect MenB
proteins. Also, the host cells or extracts can be utilized for biological assays to isolate
agonists or antagonists. In addition, with these sequences one can search to identify open
reading frames and identify amino acid sequences. The proteins may also be used in
immunogenic compositions and as vaccine components.

The practice of the present invention will employ, unless otherwise indicated,
conventional techniques of molecular biology, microbiology, recombinant DNA, and
immunology, which are within the skill of the art. Such techniques are explained fully in the
literature, e.g., Sambrook Molecular Cloning: A Laboratory Manual, Second Edition (1989);
DNA Cloning, Volumes I and II (D.N. Glover ed. 1985); Oligonucleotide Synthesis (M.J. Gait
ed. 1984); Nucleic Acid Hybridization (B.D. Hames & S.J. Higgins eds. 1984); Transcription
and Translation (B.D. Hames & S.J. Higgins eds. 1984); Animal Cell Culture (R.I. Freshney
ed. 1986); Immobilized Cells and Enzymes (IRL Press, 1986); B. Perbal, A Practical Guide to
Molecular Cloning (1984); the Methods in Enzymology series (Academic Press, Inc.),
especially volumes 154 & 155; Gene Transfer Vectors for Mammalian Cells (J.H. Miller and
M.P. Calos eds. 1987, Cold Spring Harbor Laboratory); Mayer and Walker, eds. (1987),
Immunochemical Methods in Cell and Molecular Biology (Academic Press, London); Scopes,
(1987) Protein Purification: Principles and Practice, Second Edition (Springer-Verlag,
N.Y.), and Handbook of Experimental Immunology, Volumes I-IV (D.M. Weir and C.C.
Blackwell eds. 1986).

Standard abbreviations for nucleotides and amino acids are used in this specification. 

All publications, patents, and patent applications cited herein are incorporated in full
by reference.

Expression systems 

The Neisseria MenB nucleotide sequences can be expressed in a variety of different
expression systems; for example those used with mammalian cells, plant cells, baculoviruses,
bacteria, and yeast.

i. Mammalian Systems 

Mammalian expression systems are known in the art. A mammalian promoter is any
DNA sequence capable of binding mammalian RNA polymerase and initiating the
downstream (3') transcription of a coding sequence (e.g., structural gene) into mRNA. A
promoter will have a transcription initiating region, which is usually placed proximal to the 5'
end of the coding sequence, and a TATA box, usually located 25-30 base pairs (bp) upstream
of the transcription initiation site. The TATA box is thought to direct RNA polymerase II to
begin RNA synthesis at the correct site. A mammalian promoter will also contain an
upstream promoter element, usually located within 100 to 200 bp upstream of the TATA box.
An upstream promoter element determines the rate at which transcription is initiated and can
act in either orientation (Sambrook et al. (1989) "Expression of Cloned Genes in Mammalian
Cells." In Molecular Cloning: A Laboratory Manual, 2nd ed.).

Mammalian viral genes are often highly expressed and have a broad host range;
therefore sequences encoding mammalian viral genes provide particularly useful promoter
sequences. Examples include the SV40 early promoter, mouse mammary tumor virus LTR
promoter, adenovirus major late promoter (Ad MLP), and herpes simplex virus promoter. In
addition, sequences derived from non-viral genes, such as the murine metallothionein gene,
also provide useful promoter sequences. Expression may be either constitutive or regulated
(inducible). Depending on the promoter selected, many promoters may be inducible using
known substrates, such as the use of the mouse mammary tumor virus (MMTV) promoter
with the glucocorticoid responsive element (GRE) that is induced by glucocorticoid in
hormone-responsive transformed cells (see for example, U.S. Patent 5,783,681).

The presence of an enhancer element (enhancer), combined with the promoter
elements described above, will usually increase expression levels. An enhancer is a
regulatory DNA sequence that can stimulate transcription up to 1000-fold when linked to
homologous or heterologous promoters, with synthesis beginning at the normal RNA start
site. Enhancers are also active when they are placed upstream or downstream from the
transcription initiation site, in either normal or flipped orientation, or at a distance of more
than 1000 nucleotides from the promoter (Maniatis et al. (1987) Science 236:1237; Alberts et
al. (1989) Molecular Biology of the Cell, 2nd ed.). Enhancer elements derived from viruses
may be particularly useful, because they usually have a broader host range. Examples
include the SV40 early gene enhancer (Dijkema et al. (1985) EMBO J. 4:761) and the
enhancer/promoters derived from the long terminal repeat (LTR) of the Rous Sarcoma Virus
(Gorman et al. (1982b) Proc. Natl. Acad. Sci. 79:6777) and from human cytomegalovirus
(Boshart et al. (1985) Cell 41:521). Additionally, some enhancers are regulatable and
become active only in the presence of an inducer, such as a hormone or metal ion
(Sassone-Corsi and Borelli (1986) Trends Genet. 2:215; Maniatis et al. (1987) Science 236:1237).

A DNA molecule may be expressed intracellularly in mammalian cells. A promoter
sequence may be directly linked with the DNA molecule, in which case the first amino acid
at the N-terminus of the recombinant protein will always be a methionine, which is encoded
by the ATG start codon. If desired, the N-terminus may be cleaved from the protein by in
vitro incubation with cyanogen bromide.

Alternatively, foreign proteins can also be secreted from the cell into the growth
media by creating chimeric DNA molecules that encode a fusion protein comprised of a
leader sequence fragment that provides for secretion of the foreign protein in mammalian
cells. Preferably, there are processing sites encoded between the leader fragment and the
foreign gene that can be cleaved either in vivo or in vitro. The leader sequence fragment
usually encodes a signal peptide comprised of hydrophobic amino acids which direct the
secretion of the protein from the cell. The adenovirus tripartite leader is an example of a
leader sequence that provides for secretion of a foreign protein in mammalian cells.

Usually, transcription termination and polyadenylation sequences recognized by
mammalian cells are regulatory regions located 3' to the translation stop codon and thus,
together with the promoter elements, flank the coding sequence. The 3' terminus of the
mature mRNA is formed by site-specific post-transcriptional cleavage and polyadenylation
(Birnstiel et al. (1985) Cell 41:349; Proudfoot and Whitelaw (1988) "Termination and 3' end
processing of eukaryotic RNA." In Transcription and splicing (ed. B.D. Hames and D.M.
Glover); Proudfoot (1989) Trends Biochem. Sci. 14:105). These sequences direct the
transcription of an mRNA which can be translated into the polypeptide encoded by the DNA.
Examples of transcription terminator/polyadenylation signals include those derived from
SV40 (Sambrook et al. (1989) "Expression of cloned genes in cultured mammalian cells." In
Molecular Cloning: A Laboratory Manual).

Usually, the above-described components, comprising a promoter, polyadenylation
signal, and transcription termination sequence, are put together into expression constructs.
Enhancers, introns with functional splice donor and acceptor sites, and leader sequences may
also be included in an expression construct, if desired. Expression constructs are often
maintained in a replicon, such as an extrachromosomal element (e.g., plasmids) capable of
stable maintenance in a host, such as mammalian cells or bacteria. Mammalian replication
systems include those derived from animal viruses, which require trans-acting factors to
replicate. For example, plasmids containing the replication systems of papovaviruses, such
as SV40 (Gluzman (1981) Cell 23:175) or polyomavirus, replicate to extremely high copy
number in the presence of the appropriate viral T antigen. Additional examples of
mammalian replicons include those derived from bovine papillomavirus and Epstein-Barr
virus. Additionally, the replicon may have two replication systems, thus allowing it to be
maintained, for example, in mammalian cells for expression and in a prokaryotic host for
cloning and amplification. Examples of such mammalian-bacteria shuttle vectors include
pMT2 (Kaufman et al. (1989) Mol. Cell. Biol. 9:946) and pHEBO (Shimizu et al. (1986) Mol.
Cell. Biol. 6:1074).

The transformation procedure used depends upon the host to be transformed.
Methods for introduction of heterologous polynucleotides into mammalian cells are known in
the art and include dextran-mediated transfection, calcium phosphate precipitation,
polybrene-mediated transfection, protoplast fusion, electroporation, encapsulation of the
polynucleotide(s) in liposomes, and direct microinjection of the DNA into nuclei.

Mammalian cell lines available as hosts for expression are known in the art and
include many immortalized cell lines available from the American Type Culture Collection
(ATCC), including but not limited to, Chinese hamster ovary (CHO) cells, HeLa cells, baby
hamster kidney (BHK) cells, monkey kidney cells (COS), human hepatocellular carcinoma
cells (e.g., Hep G2), and a number of other cell lines.

ii. Plant Cellular Expression Systems 

There are many plant cell culture and whole plant genetic expression systems known
in the art. Exemplary plant cellular genetic expression systems include those described in
patents, such as: U.S. 5,693,506; U.S. 5,659,122; and U.S. 5,608,143. Additional examples of
genetic expression in plant cell culture have been described by Zenk, Phytochemistry 30:3861-
3863 (1991). Descriptions of plant protein signal peptides may be found, in addition to the
references described above, in Vaulcombe et al., Mol. Gen. Genet. 209:33-40 (1987);
Chandler et al., Plant Molecular Biology 3:407-418 (1984); Rogers, J. Biol. Chem. 260:3731-
3738 (1985); Rothstein et al., Gene 55:353-356 (1987); Whittier et al., Nucleic Acids
Research 15:2515-2535 (1987); Wirsel et al., Molecular Microbiology 3:3-14 (1989); Yu et
al., Gene 122:247-253 (1992). A description of the regulation of plant gene expression by the
phytohormone gibberellic acid, and of secreted enzymes induced by gibberellic acid, can be
found in R.L. Jones and J. MacMillin, Gibberellins, in: Advanced Plant Physiology,
Malcolm B. Wilkins, ed., 1984 Pitman Publishing Limited, London, pp. 21-52. References
that describe other metabolically-regulated genes include: Sheen, Plant Cell 2:1027-1038 (1990);
Maas et al., EMBO J. 9:3447-3452 (1990); Benkel and Hickey, Proc. Natl. Acad. Sci.
84:1337-1339 (1987).

Typically, using techniques known in the art, a desired polynucleotide sequence is
inserted into an expression cassette comprising genetic regulatory elements designed for
operation in plants. The expression cassette is inserted into a desired expression vector with
companion sequences upstream and downstream from the expression cassette suitable for
expression in a plant host. The companion sequences will be of plasmid or viral origin and
provide necessary characteristics to the vector to permit the vectors to move DNA from an
original cloning host, such as bacteria, to the desired plant host. The basic bacterial/plant
vector construct will preferably provide a broad host range prokaryote replication origin; a
prokaryote selectable marker; and, for Agrobacterium transformations, T DNA sequences for
Agrobacterium-mediated transfer to plant chromosomes. Where the heterologous gene is not
readily amenable to detection, the construct will preferably also have a selectable marker
gene suitable for determining if a plant cell has been transformed. A general review of
suitable markers, for example for the members of the grass family, is found in Wilmink and
Dons, 1993, Plant Mol. Biol. Reptr, 11(2):165-185.

Sequences suitable for permitting integration of the heterologous sequence into the
plant genome are also recommended. These might include transposon sequences and the like
for homologous recombination as well as Ti sequences which permit random insertion of a
heterologous expression cassette into a plant genome. Suitable prokaryote selectable markers
include resistance toward antibiotics such as ampicillin or tetracycline. Other DNA
sequences encoding additional functions may also be present in the vector, as is known in the
art.

The nucleic acid molecules of the subject invention may be included in an
expression cassette for expression of the protein(s) of interest. Usually, there will be only
one expression cassette, although two or more are feasible. The recombinant expression
cassette will contain, in addition to the heterologous protein encoding sequence, the following
elements: a promoter region, plant 5' untranslated sequences, an initiation codon (depending
upon whether or not the structural gene comes equipped with one), and a transcription and
translation termination sequence. Unique restriction enzyme sites at the 5' and 3' ends of the
cassette allow for easy insertion into a pre-existing vector.
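
The cassette layout described above can be sketched as an ordered concatenation of parts. This is only an illustration under stated assumptions: the class `Cassette`, its field names, and the short placeholder sequences are hypothetical, and the EcoRI (GAATTC) and HindIII (AAGCTT) recognition sequences are used merely as arbitrary examples of flanking restriction sites.

```python
from dataclasses import dataclass


@dataclass
class Cassette:
    """Hypothetical container for the parts of a plant expression cassette."""
    promoter: str
    utr5: str          # plant 5' untranslated sequence
    coding: str        # heterologous protein encoding sequence
    terminator: str    # transcription and translation termination sequence
    site5: str = ""    # restriction site at the 5' end
    site3: str = ""    # restriction site at the 3' end

    def assemble(self) -> str:
        """Join the parts 5' to 3', adding ATG only if the gene lacks one."""
        cds = self.coding.upper()
        if not cds.startswith("ATG"):
            cds = "ATG" + cds
        parts = [self.site5, self.promoter, self.utr5, cds,
                 self.terminator, self.site3]
        return "".join(part.upper() for part in parts)


if __name__ == "__main__":
    # All sequences below are placeholders, not functional elements.
    cassette = Cassette(promoter="ttgg", utr5="cc", coding="GGGTAA",
                        terminator="tttt", site5="GAATTC", site3="AAGCTT")
    print(cassette.assemble())
```

Adding the initiation codon conditionally mirrors the statement that it is supplied only when the structural gene does not already carry one.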

A heterologous coding sequence may be for any protein relating to the present
invention. The sequence encoding the protein of interest will encode a signal peptide which
allows processing and translocation of the protein, as appropriate, and will usually lack any
sequence which might result in the binding of the desired protein of the invention to a
membrane. Since, for the most part, the transcriptional initiation region will be for a gene
which is expressed and translocated during germination, by employing the signal peptide
which provides for translocation, one may also provide for translocation of the protein of
interest. In this way, the protein(s) of interest will be translocated from the cells in which
they are expressed and may be efficiently harvested. Typically, secretion in seeds is across
the aleurone or scutellar epithelium layer into the endosperm of the seed. While it is not
required that the protein be secreted from the cells in which the protein is produced, this
facilitates the isolation and purification of the recombinant protein.

Since the ultimate expression of the desired gene product will be in a eucaryotic cell, it
is desirable to determine whether any portion of the cloned gene contains sequences which
will be processed out as introns by the host's spliceosome machinery. If so, site-directed
mutagenesis of the "intron" region may be conducted to prevent losing a portion of the
genetic message as a false intron code (Reed and Maniatis, Cell 41:95-105, 1985).

The vector can be microinjected directly into plant cells by use of micropipettes to
mechanically transfer the recombinant DNA. Crossway, Mol. Gen. Genet. 202:179-185,
1985. The genetic material may also be transferred into the plant cell by using polyethylene
glycol, Krens, et al., Nature, 296, 72-74, 1982. Another method of introduction of nucleic
acid segments is high velocity ballistic penetration by small particles with the nucleic acid
either within the matrix of small beads or particles, or on the surface, Klein, et al., Nature,
327, 70-73, 1987 and Knudsen and Muller, 1991, Planta, 185:330-336 teaching particle
bombardment of barley endosperm to create transgenic barley. Yet another method of
introduction would be fusion of protoplasts with other entities, either minicells, cells,
lysosomes or other fusible lipid-surfaced bodies, Fraley, et al., Proc. Natl. Acad. Sci. USA,
79, 1859-1863, 1982.

The vector may also be introduced into the plant cells by electroporation (Fromm et
al., Proc. Natl. Acad. Sci. USA 82:5824, 1985). In this technique, plant protoplasts are
electroporated in the presence of plasmids containing the gene construct. Electrical impulses
of high field strength reversibly permeabilize biomembranes allowing the introduction of the
plasmids. Electroporated plant protoplasts reform the cell wall, divide, and form plant callus.

All plants from which protoplasts can be isolated and cultured to give whole
regenerated plants can be transformed by the present invention so that whole plants are
recovered which contain the transferred gene. It is known that practically all plants can be
regenerated from cultured cells or tissues, including but not limited to all major species of
sugarcane, sugar beet, cotton, fruit and other trees, legumes and vegetables. Some suitable
plants include, for example, species from the genera Fragaria, Lotus, Medicago, Onobrychis,
Trifolium, Trigonella, Vigna, Citrus, Linum, Geranium, Manihot, Daucus, Arabidopsis,
Brassica, Raphanus, Sinapis, Atropa, Capsicum, Datura, Hyoscyamus, Lycopersicon,
Nicotiana, Solanum, Petunia, Digitalis, Majorana, Cichorium, Helianthus, Lactuca, Bromus,
Asparagus, Antirrhinum, Hemerocallis, Nemesia, Pelargonium, Panicum, Pennisetum,
Ranunculus, Senecio, Salpiglossis, Cucumis, Browallia, Glycine, Lolium, Zea, Triticum,
Sorghum, and Datura.

Means for regeneration vary from species to species of plants, but generally a
suspension of transformed protoplasts containing copies of the heterologous gene is first
provided. Callus tissue is formed and shoots may be induced from callus and subsequently
rooted. Alternatively, embryo formation can be induced from the protoplast suspension.
These embryos germinate as natural embryos to form plants. The culture media will
generally contain various amino acids and hormones, such as auxin and cytokinins. It is also
advantageous to add glutamic acid and proline to the medium, especially for such species as
corn and alfalfa. Shoots and roots normally develop simultaneously. Efficient regeneration
will depend on the medium, on the genotype, and on the history of the culture. If these three
variables are controlled, then regeneration is fully reproducible and repeatable.

In some plant cell culture systems, the desired protein of the invention may be
excreted or alternatively, the protein may be extracted from the whole plant. Where the
desired protein of the invention is secreted into the medium, it may be collected.
Alternatively, the embryos and embryoless-half seeds or other plant tissue may be
mechanically disrupted to release any secreted protein between cells and tissues. The mixture
may be suspended in a buffer solution to retrieve soluble proteins. Conventional protein
isolation and purification methods will then be used to purify the recombinant protein.
Parameters of time, temperature, pH, oxygen, and volumes will be adjusted through routine
methods to optimize expression and recovery of heterologous protein.

iii. Baculovirus Systems 

The polynucleotide encoding the protein can also be inserted into a suitable insect
expression vector, and is operably linked to the control elements within that vector. Vector
construction employs techniques which are known in the art. Generally, the components of
the expression system include a transfer vector, usually a bacterial plasmid, which contains
both a fragment of the baculovirus genome, and a convenient restriction site for insertion of
the heterologous gene or genes to be expressed; a wild type baculovirus with a sequence
homologous to the baculovirus-specific fragment in the transfer vector (this allows for the
homologous recombination of the heterologous gene into the baculovirus genome); and
appropriate insect host cells and growth media.

After inserting the DNA sequence encoding the protein into the transfer vector, the
vector and the wild type viral genome are transfected into an insect host cell where the vector
and viral genome are allowed to recombine. The packaged recombinant virus is expressed
and recombinant plaques are identified and purified. Materials and methods for
baculovirus/insect cell expression systems are commercially available in kit form from, inter
alia, Invitrogen, San Diego CA ("MaxBac" kit). These techniques are generally known to
those skilled in the art and fully described in Summers and Smith, Texas Agricultural
Experiment Station Bulletin No. 1555 (1987) (hereinafter "Summers and Smith").

Prior to inserting the DNA sequence encoding the protein into the baculovirus
genome, the above described components, comprising a promoter, leader (if desired), coding
sequence of interest, and transcription termination sequence, are usually assembled into an
intermediate transplacement construct (transfer vector). This construct may contain a single
gene and operably linked regulatory elements; multiple genes, each with its own set of
operably linked regulatory elements; or multiple genes, regulated by the same set of
regulatory elements. Intermediate transplacement constructs are often maintained in a
replicon, such as an extrachromosomal element (e.g., plasmids) capable of stable
maintenance in a host, such as a bacterium. The replicon will have a replication system, thus
allowing it to be maintained in a suitable host for cloning and amplification.

Currently, the most commonly used transfer vector for introducing foreign genes into
AcNPV is pAc373. Many other vectors, known to those of skill in the art, have also been
designed. These include, for example, pVL985 (which alters the polyhedrin start codon from
ATG to ATT, and which introduces a BamHI cloning site 32 basepairs downstream from the
ATT; see Luckow and Summers, Virology (1989) 77:31).

The plasmid usually also contains the polyhedrin polyadenylation signal (Miller et al.
(1988) Ann. Rev. Microbiol., 42:177) and a prokaryotic ampicillin-resistance (amp) gene and
origin of replication for selection and propagation in E. coli.

Baculovirus transfer vectors usually contain a baculovirus promoter. A baculovirus
promoter is any DNA sequence capable of binding a baculovirus RNA polymerase and
initiating the downstream (5' to 3') transcription of a coding sequence (e.g., structural gene)
into mRNA. A promoter will have a transcription initiation region which is usually placed
proximal to the 5' end of the coding sequence. This transcription initiation region usually
includes an RNA polymerase binding site and a transcription initiation site. A baculovirus
transfer vector may also have a second domain called an enhancer, which, if present, is
usually distal to the structural gene. Expression may be either regulated or constitutive.

Structural genes, abundantly transcribed at late times in a viral infection cycle,
provide particularly useful promoter sequences. Examples include sequences derived from
the gene encoding the viral polyhedron protein, Friesen et al., (1986) "The Regulation of
Baculovirus Gene Expression," in: The Molecular Biology of Baculoviruses (ed. Walter
Doerfler); EPO Publ. Nos. 127 839 and 155 476; and the gene encoding the p10 protein, Vlak
et al., (1988), J. Gen. Virol. 69:765.

DNA encoding suitable signal sequences can be derived from genes for secreted
insect or baculovirus proteins, such as the baculovirus polyhedrin gene (Carbonell et al.
(1988) Gene, 75:409). Alternatively, since the signals for mammalian cell posttranslational
modifications (such as signal peptide cleavage, proteolytic cleavage, and phosphorylation)
appear to be recognized by insect cells, and the signals required for secretion and nuclear
accumulation also appear to be conserved between the invertebrate cells and vertebrate cells,
leaders of non-insect origin, such as those derived from genes encoding human
α-interferon, Maeda et al., (1985), Nature 315:592; human gastrin-releasing peptide, Lebacq-
Verheyden et al., (1988), Molec. Cell. Biol. 8:3129; human IL-2, Smith et al., (1985) Proc.
Nat'l Acad. Sci. USA, 82:8404; mouse IL-3, Miyajima et al., (1987) Gene 55:273; and
human glucocerebrosidase, Martin et al. (1988) DNA, 7:99, can also be used to provide for
secretion in insects.

A recombinant polypeptide or polyprotein may be expressed intracellularly or, if it is
expressed with the proper regulatory sequences, it can be secreted. Good intracellular
expression of nonfused foreign proteins usually requires heterologous genes that ideally have
a short leader sequence containing suitable translation initiation signals preceding an ATG
start signal. If desired, methionine at the N-terminus may be cleaved from the mature protein
by in vitro incubation with cyanogen bromide.

Alternatively, recombinant polyproteins or proteins which are not naturally secreted
can be secreted from the insect cell by creating chimeric DNA molecules that encode a fusion
protein comprised of a leader sequence fragment that provides for secretion of the foreign
protein in insects. The leader sequence fragment usually encodes a signal peptide comprised
of hydrophobic amino acids which direct the translocation of the protein into the endoplasmic
reticulum.

After insertion of the DNA sequence and/or the gene encoding the expression product
precursor of the protein, an insect cell host is co-transformed with the heterologous DNA of
the transfer vector and the genomic DNA of wild type baculovirus, usually by
co-transfection. The promoter and transcription termination sequence of the construct will
usually comprise a 2-5kb section of the baculovirus genome. Methods for introducing
heterologous DNA into the desired site in the baculovirus are known in the art. (See
Summers and Smith supra; Ju et al. (1987); Smith et al., Mol. Cell. Biol. (1983) 3:2156; and
Luckow and Summers (1989)). For example, the insertion can be into a gene such as the
polyhedrin gene, by homologous double crossover recombination; insertion can also be into a
restriction enzyme site engineered into the desired baculovirus gene. Miller et al., (1989),
Bioessays 4:91. The DNA sequence, when cloned in place of the polyhedrin gene in the
expression vector, is flanked both 5' and 3' by polyhedrin-specific sequences and is
positioned downstream of the polyhedrin promoter.

The newly formed baculovirus expression vector is subsequently packaged into an
infectious recombinant baculovirus. Homologous recombination occurs at low frequency
(between about 1% and about 5%); thus, the majority of the virus produced after
cotransfection is still wild-type virus. Therefore, a method is necessary to identify
recombinant viruses. An advantage of the expression system is a visual screen allowing
recombinant viruses to be distinguished. The polyhedrin protein, which is produced by the
native virus, is produced at very high levels in the nuclei of infected cells at late times after
viral infection. Accumulated polyhedrin protein forms occlusion bodies that also contain
embedded particles. These occlusion bodies, up to 15 μm in size, are highly refractile, giving
them a bright shiny appearance that is readily visualized under the light microscope. Cells
infected with recombinant viruses lack occlusion bodies. To distinguish recombinant virus
from wild-type virus, the transfection supernatant is plaqued onto a monolayer of insect cells
by techniques known to those skilled in the art. Namely, the plaques are screened under the
light microscope for the presence (indicative of wild-type virus) or absence (indicative of
recombinant virus) of occlusion bodies. Current Protocols in Microbiology Vol. 2 (Ausubel
et al. eds) at 16.8 (Supp. 10, 1990); Summers and Smith, supra; Miller et al. (1989).
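
The practical consequence of the 1% to 5% recombination frequency can be estimated with a short calculation. The sketch below assumes, as a simplification not stated in the text, that each plaque is an independent draw with a fixed probability of being recombinant; the function name `plaques_to_screen` and the 95% confidence default are illustrative choices.

```python
import math


def plaques_to_screen(recomb_freq: float, confidence: float = 0.95) -> int:
    """Smallest number of plaques n such that the chance of seeing at least
    one recombinant, 1 - (1 - f)**n, reaches the requested confidence."""
    if not 0.0 < recomb_freq < 1.0:
        raise ValueError("recomb_freq must be between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - recomb_freq))


if __name__ == "__main__":
    for freq in (0.01, 0.05):
        print(f"frequency {freq:.0%}: screen about "
              f"{plaques_to_screen(freq)} plaques")
```

Under these assumptions, roughly 300 plaques are needed at a 1% frequency and roughly 60 at 5%, which is why a fast visual screen for occlusion bodies is useful.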

Recombinant baculovirus expression vectors have been developed for infection into
several insect cells. For example, recombinant baculoviruses have been developed for, inter
alia: Aedes aegypti, Autographa californica, Bombyx mori, Drosophila melanogaster,
Spodoptera frugiperda, and Trichoplusia ni (PCT Pub. No. WO 89/046699; Carbonell et al.,
(1985) J. Virol. 56:153; Wright (1986) Nature 321:718; Smith et al., (1983) Mol. Cell. Biol.
3:2156; and see generally, Fraser, et al. (1989) In Vitro Cell. Dev. Biol. 25:225).

Cells and cell culture media are commercially available for both direct and fusion
expression of heterologous polypeptides in a baculovirus/expression system; cell culture
technology is generally known to those skilled in the art. See, e.g., Summers and Smith
supra.



WO 00/22430 PCT/US99/23573




The modified insect cells may then be grown in an appropriate nutrient medium,
which allows for stable maintenance of the plasmid(s) present in the modified insect host.
Where the expression product gene is under inducible control, the host may be grown to high
density, and expression induced. Alternatively, where expression is constitutive, the product
will be continuously expressed into the medium and the nutrient medium must be
continuously circulated, while removing the product of interest and augmenting depleted
nutrients. The product may be purified by such techniques as chromatography, e.g., HPLC,
affinity chromatography, ion exchange chromatography, etc.; electrophoresis; density
gradient centrifugation; solvent extraction, or the like. As appropriate, the product may be
further purified, as required, so as to remove substantially any insect proteins which are also
secreted in the medium or result from lysis of insect cells, so as to provide a product which is
at least substantially free of host debris, e.g., proteins, lipids and polysaccharides.

In order to obtain protein expression, recombinant host cells derived from the
transformants are incubated under conditions which allow expression of the recombinant
protein encoding sequence. These conditions will vary, dependent upon the host cell selected.
However, the conditions are readily ascertainable to those of ordinary skill in the art, based
upon what is known in the art.

iv. Bacterial Systems 

Bacterial expression techniques are known in the art. A bacterial promoter is any
DNA sequence capable of binding bacterial RNA polymerase and initiating the downstream
(3') transcription of a coding sequence (e.g. structural gene) into mRNA. A promoter will
have a transcription initiation region which is usually placed proximal to the 5' end of the
coding sequence. This transcription initiation region usually includes an RNA polymerase
binding site and a transcription initiation site. A bacterial promoter may also have a second
domain called an operator, that may overlap an adjacent RNA polymerase binding site at
which RNA synthesis begins. The operator permits negatively regulated (inducible)
transcription, as a gene repressor protein may bind the operator and thereby inhibit
transcription of a specific gene. Constitutive expression may occur in the absence of negative
regulatory elements, such as the operator. In addition, positive regulation may be achieved by
a gene activator protein binding sequence, which, if present, is usually proximal (5') to the
RNA polymerase binding sequence. An example of a gene activator protein is the catabolite
activator protein (CAP), which helps initiate transcription of the lac operon in Escherichia
coli (E. coli) (Raibaud et al. (1984) Annu. Rev. Genet. 18:173). Regulated expression may
therefore be either positive or negative, thereby either enhancing or reducing transcription.

Sequences encoding metabolic pathway enzymes provide particularly useful promoter
sequences. Examples include promoter sequences derived from sugar metabolizing enzymes,
such as galactose, lactose (lac) (Chang et al. (1977) Nature 198:1056), and maltose.
Additional examples include promoter sequences derived from biosynthetic enzymes such as
tryptophan (trp) (Goeddel et al. (1980) Nuc. Acids Res. 8:4057; Yelverton et al. (1981) Nucl.
Acids Res. 9:731; U.S. Patent 4,738,921; EPO Publ. Nos. 036 776 and 121 775). The beta-
lactamase (bla) promoter system (Weissmann (1981) "The cloning of interferon and other
mistakes." In Interferon 3 (ed. I. Gresser)), bacteriophage lambda PL (Shimatake et al. (1981)
Nature 292:128) and T5 (U.S. Patent 4,689,406) promoter systems also provide useful
promoter sequences.

In addition, synthetic promoters which do not occur in nature also function as
bacterial promoters. For example, transcription activation sequences of one bacterial or
bacteriophage promoter may be joined with the operon sequences of another bacterial or
bacteriophage promoter, creating a synthetic hybrid promoter (U.S. Patent 4,551,433). For
example, the tac promoter is a hybrid trp-lac promoter comprised of both trp promoter and
lac operon sequences that is regulated by the lac repressor (Amann et al. (1983) Gene
25:167; de Boer et al. (1983) Proc. Natl. Acad. Sci. 80:21). Furthermore, a bacterial promoter
can include naturally occurring promoters of non-bacterial origin that have the ability to bind
bacterial RNA polymerase and initiate transcription. A naturally occurring promoter of non-
bacterial origin can also be coupled with a compatible RNA polymerase to produce high
levels of expression of some genes in prokaryotes. The bacteriophage T7 RNA
polymerase/promoter system is an example of a coupled promoter system (Studier et al.
(1986) J. Mol. Biol. 189:113; Tabor et al. (1985) Proc. Natl. Acad. Sci. 82:1074). In addition,
a hybrid promoter can also be comprised of a bacteriophage promoter and an E. coli operator
region (EPO Publ. No. 267 851).

In addition to a functioning promoter sequence, an efficient ribosome binding site is
also useful for the expression of foreign genes in prokaryotes. In E. coli, the ribosome
binding site is called the Shine-Dalgarno (SD) sequence and includes an initiation codon
(ATG) and a sequence 3-9 nucleotides in length located 3-11 nucleotides upstream of the
initiation codon (Shine et al. (1975) Nature 254:34). The SD sequence is thought to promote
binding of mRNA to the ribosome by the pairing of bases between the SD sequence and the
3' end of E. coli 16S rRNA (Steitz et al. (1979) "Genetic signals and nucleotide sequences in
messenger RNA." In Biological Regulation and Development: Gene Expression (ed. R.F.
Goldberger)). To express eukaryotic genes and prokaryotic genes with a weak ribosome-
binding site, it is often necessary to optimize the distance between the SD sequence and the
ATG of the eukaryotic gene (Sambrook et al. (1989) "Expression of cloned genes in
Escherichia coli." In Molecular Cloning: A Laboratory Manual).
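
The spacing rule described above can be checked on a candidate construct with a small scan. This is a minimal sketch under stated assumptions: the core consensus `AGGAGG` is a commonly used approximation of the SD sequence and is not taken from the text, so only matches of 3 to 6 nucleotides are considered; the function name `find_sd` and the example sequence are hypothetical.

```python
SD_CONSENSUS = "AGGAGG"  # assumed core consensus, not specified in the text


def find_sd(seq: str, atg_index: int, min_len: int = 3,
            min_gap: int = 3, max_gap: int = 11):
    """Return (start, motif, gap) for the longest consensus fragment that
    ends min_gap..max_gap nucleotides upstream of the ATG, or None."""
    seq = seq.upper()
    if seq[atg_index:atg_index + 3] != "ATG":
        raise ValueError("atg_index does not point at an ATG codon")
    # Every contiguous piece of the consensus, longest pieces first.
    pieces = {SD_CONSENSUS[i:j]
              for i in range(len(SD_CONSENSUS))
              for j in range(i + min_len, len(SD_CONSENSUS) + 1)}
    for piece in sorted(pieces, key=lambda p: (-len(p), p)):
        for gap in range(min_gap, max_gap + 1):
            start = atg_index - gap - len(piece)
            if start >= 0 and seq[start:start + len(piece)] == piece:
                return start, piece, gap
    return None


if __name__ == "__main__":
    # Hypothetical 5' region: SD-like motif, 7-nucleotide spacer, then ATG.
    example = "TTTAAGGAGGTATACATATGAAA"
    print(find_sd(example, example.index("ATG")))
```

Reporting the gap alongside the motif makes it easy to compare constructs when optimizing the distance between the SD sequence and the ATG.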

A DNA molecule may be expressed intracellularly. A promoter sequence may be
directly linked with the DNA molecule, in which case the first amino acid at the N-terminus
will always be a methionine, which is encoded by the ATG start codon. If desired,
methionine at the N-terminus may be cleaved from the protein by in vitro incubation with
cyanogen bromide or by either in vivo or in vitro incubation with a bacterial methionine N-
terminal peptidase (EPO Publ. No. 219 237).

Fusion proteins provide an alternative to direct expression. Usually, a DNA sequence
encoding the N-terminal portion of an endogenous bacterial protein, or other stable protein, is
fused to the 5' end of heterologous coding sequences. Upon expression, this construct will
provide a fusion of the two amino acid sequences. For example, the bacteriophage lambda
cII gene can be linked at the 5' terminus of a foreign gene and expressed in bacteria. The
resulting fusion protein preferably retains a site for a processing enzyme (factor Xa) to cleave
the bacteriophage protein from the foreign gene (Nagai et al. (1984) Nature 309:810). Fusion
proteins can also be made with sequences from the lacZ (Jia et al. (1987) Gene 60:197), trpE
(Allen et al. (1987) J. Biotechnol. 5:93; Makoff et al. (1989) J. Gen. Microbiol. 135:11), and
Chey (EPO Publ. No. 324 647) genes. The DNA sequence at the junction of the two amino
acid sequences may or may not encode a cleavable site. Another example is a ubiquitin fusion
protein. Such a fusion protein is made with the ubiquitin region that preferably retains a site
for a processing enzyme (e.g. ubiquitin specific processing-protease) to cleave the ubiquitin
from the foreign protein. Through this method, native foreign protein can be isolated (Miller
et al. (1989) Bio/Technology 7:698).

Alternatively, foreign proteins can also be secreted from the cell by creating chimeric
DNA molecules that encode a fusion protein comprised of a signal peptide sequence
fragment that provides for secretion of the foreign protein in bacteria (U.S. Patent 4,336,336).
The signal sequence fragment usually encodes a signal peptide comprised of hydrophobic
amino acids which direct the secretion of the protein from the cell. The protein is either
secreted into the growth media (gram-positive bacteria) or into the periplasmic space, located
between the inner and outer membrane of the cell (gram-negative bacteria). Preferably there
are processing sites, which can be cleaved either in vivo or in vitro, encoded between the
signal peptide fragment and the foreign gene.
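
The hydrophobic character of a candidate signal peptide can be gauged with a sliding-window hydropathy average. The sketch below uses the Kyte-Doolittle hydropathy scale, which is an assumption introduced here for illustration and is not named in the text; the function name `max_window_hydropathy`, the window size, and the example peptide are likewise hypothetical.

```python
# Kyte-Doolittle hydropathy values (higher means more hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def max_window_hydropathy(protein: str, window: int = 8) -> float:
    """Return the highest mean hydropathy over any window of the sequence."""
    protein = protein.upper()
    if window < 1 or len(protein) < window:
        raise ValueError("sequence is shorter than the window")
    scores = [KYTE_DOOLITTLE[residue] for residue in protein]
    return max(sum(scores[i:i + window]) / window
               for i in range(len(scores) - window + 1))


if __name__ == "__main__":
    # Made-up leader: charged N-terminus followed by a hydrophobic stretch.
    leader = "MKKTLLAVALLLAGSA"
    print(round(max_window_hydropathy(leader), 2))
```

A strongly positive maximum over the leader region is consistent with a hydrophobic core, although it does not by itself show that a sequence functions as a signal peptide.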

DNA encoding suitable signal sequences can be derived from genes for secreted
bacterial proteins, such as the E. coli outer membrane protein gene (ompA) (Masui et al.
(1983), in: Experimental Manipulation of Gene Expression; Ghrayeb et al. (1984) EMBO J.
3:2437) and the E. coli alkaline phosphatase signal sequence (phoA) (Oka et al. (1985) Proc.
Natl. Acad. Sci. 82:7212). As an additional example, the signal sequence of the alpha-
amylase gene from various Bacillus strains can be used to secrete heterologous proteins from
B. subtilis (Palva et al. (1982) Proc. Natl. Acad. Sci. USA 79:5582; EPO Publ. No. 244 042).

Usually, transcription termination sequences recognized by bacteria are regulatory
regions located 3' to the translation stop codon, and thus together with the promoter flank the
coding sequence. These sequences direct the transcription of an mRNA which can be
translated into the polypeptide encoded by the DNA. Transcription termination sequences
frequently include DNA sequences of about 50 nucleotides capable of forming stem loop
structures that aid in terminating transcription. Examples include transcription termination
sequences derived from genes with strong promoters, such as the trp gene in E. coli as well as
other biosynthetic genes.

Usually, the above described components, comprising a promoter, signal sequence (if
desired), coding sequence of interest, and transcription termination sequence, are put together
into expression constructs. Expression constructs are often maintained in a replicon, such as
an extrachromosomal element (e.g., plasmids) capable of stable maintenance in a host, such
as bacteria. The replicon will have a replication system, thus allowing it to be maintained in a
prokaryotic host either for expression or for cloning and amplification. In addition, a replicon
may be either a high or low copy number plasmid. A high copy number plasmid will
generally have a copy number ranging from about 5 to about 200, and usually about 10 to
about 150. A host containing a high copy number plasmid will preferably contain at least
about 10, and more preferably at least about 20 plasmids. Either a high or low copy number
vector may be selected, depending upon the effect of the vector and the foreign protein on the
host.

Alternatively, the expression constructs can be integrated into the bacterial genome
with an integrating vector. Integrating vectors usually contain at least one sequence
homologous to the bacterial chromosome that allows the vector to integrate. Integrations
appear to result from recombinations between homologous DNA in the vector and the
bacterial chromosome. For example, integrating vectors constructed with DNA from various
Bacillus strains integrate into the Bacillus chromosome (EPO Publ. No. 127 328). Integrating
vectors may also be comprised of bacteriophage or transposon sequences.

Usually, extrachromosomal and integrating expression constructs may contain
selectable markers to allow for the selection of bacterial strains that have been transformed.
Selectable markers can be expressed in the bacterial host and may include genes which render
bacteria resistant to drugs such as ampicillin, chloramphenicol, erythromycin, kanamycin
(neomycin), and tetracycline (Davies et al. (1978) Annu. Rev. Microbiol. 32:469). Selectable
markers may also include biosynthetic genes, such as those in the histidine, tryptophan, and
leucine biosynthetic pathways.

Alternatively, some of the above described components can be put together in
transformation vectors. Transformation vectors are usually comprised of a selectable marker
that is either maintained in a replicon or developed into an integrating vector, as described
above.

Expression and transformation vectors, either extra-chromosomal replicons or
integrating vectors, have been developed for transformation into many bacteria. For example,
expression vectors have been developed for, inter alia, the following bacteria: Bacillus
subtilis (Palva et al. (1982) Proc. Natl. Acad. Sci. USA 79:5582; EPO Publ. Nos. 036 259 and
063 953; PCT Publ. No. WO 84/04541), Escherichia coli (Shimatake et al. (1981) Nature
292:128; Amann et al. (1985) Gene 40:183; Studier et al. (1986) J. Mol. Biol. 189:113; EPO
Publ. Nos. 036 776, 136 829 and 136 907), Streptococcus cremoris (Powell et al. (1988)
Appl. Environ. Microbiol. 54:655); Streptococcus lividans (Powell et al. (1988) Appl.
Environ. Microbiol. 54:655), Streptomyces lividans (U.S. Patent 4,745,056).

Methods of introducing exogenous DNA into bacterial hosts are well-known in the
art, and usually include either the transformation of bacteria treated with CaCl2 or other
agents, such as divalent cations and DMSO. DNA can also be introduced into bacterial cells
by electroporation. Transformation procedures usually vary with the bacterial species to be
transformed. (See e.g., use of Bacillus: Masson et al. (1989) FEMS Microbiol. Lett. 60:273;
Palva et al. (1982) Proc. Natl. Acad. Sci. USA 79:5582; EPO Publ. Nos. 036 259 and 063
953; PCT Publ. No. WO 84/04541; use of Campylobacter: Miller et al. (1988) Proc. Natl.
Acad. Sci. 85:856; and Wang et al. (1990) J. Bacteriol. 172:949; use of Escherichia coli:
Cohen et al. (1973) Proc. Natl. Acad. Sci. 69:2110; Dower et al. (1988) Nucleic Acids Res.
16:6127; Kushner (1978) "An improved method for transformation of Escherichia coli with
ColE1-derived plasmids," in: Genetic Engineering: Proceedings of the International
Symposium on Genetic Engineering (eds. H.W. Boyer and S. Nicosia); Mandel et al. (1970)
J. Mol. Biol. 53:159; Taketo (1988) Biochim. Biophys. Acta 949:318; use of Lactobacillus:
Chassy et al. (1987) FEMS Microbiol. Lett. 44:173; use of Pseudomonas: Fiedler et al.
(1988) Anal. Biochem. 170:38; use of Staphylococcus: Augustin et al. (1990) FEMS
Microbiol. Lett. 66:203; use of Streptococcus: Barany et al. (1980) J. Bacteriol. 144:698;
Harlander (1987) "Transformation of Streptococcus lactis by electroporation," in:
Streptococcal Genetics (ed. J. Ferretti and R. Curtiss III); Perry et al. (1981) Infect. Immun.
32:1295; Powell et al. (1988) Appl. Environ. Microbiol. 54:655; Somkuti et al. (1987) Proc.
4th Eur. Cong. Biotechnology 1:412.)



V. Yeast Expression 

Yeast expression systems are also known to one of ordinary skill in the art. A yeast
promoter is any DNA sequence capable of binding yeast RNA polymerase and initiating the
downstream (3') transcription of a coding sequence (e.g. structural gene) into mRNA. A
promoter will have a transcription initiation region which is usually placed proximal to the 5'
end of the coding sequence. This transcription initiation region usually includes an RNA
polymerase binding site (the "TATA Box") and a transcription initiation site. A yeast
promoter may also have a second domain called an upstream activator sequence (UAS),
which, if present, is usually distal to the structural gene. The UAS permits regulated
(inducible) expression. Constitutive expression occurs in the absence of a UAS. Regulated
expression may be either positive or negative, thereby either enhancing or reducing
transcription.

Yeast is a fermenting organism with an active metabolic pathway, therefore sequences
encoding enzymes in the metabolic pathway provide particularly useful promoter sequences.
Examples include alcohol dehydrogenase (ADH) (EPO Publ. No. 284 044), enolase,
glucokinase, glucose-6-phosphate isomerase, glyceraldehyde-3-phosphate-dehydrogenase
(GAP or GAPDH), hexokinase, phosphofructokinase, 3-phosphoglycerate mutase, and
pyruvate kinase (PyK) (EPO Publ. No. 329 203). The yeast PHO5 gene, encoding acid
phosphatase, also provides useful promoter sequences (Myanohara et al. (1983) Proc. Natl.
Acad. Sci. USA 80:1).

In addition, synthetic promoters which do not occur in nature also function as yeast
promoters. For example, UAS sequences of one yeast promoter may be joined with the
transcription activation region of another yeast promoter, creating a synthetic hybrid
promoter. Examples of such hybrid promoters include the ADH regulatory sequence linked to
the GAP transcription activation region (U.S. Patent Nos. 4,876,197 and 4,880,734). Other
examples of hybrid promoters include promoters which consist of the regulatory sequences of
either the ADH2, GAL4, GAL10, or PHO5 genes, combined with the transcriptional
activation region of a glycolytic enzyme gene such as GAP or PyK (EPO Publ. No. 164 556).
Furthermore, a yeast promoter can include naturally occurring promoters of non-yeast origin
that have the ability to bind yeast RNA polymerase and initiate transcription. Examples of
such promoters include, inter alia, (Cohen et al. (1980) Proc. Natl. Acad. Sci. USA 77:1078;
Henikoff et al. (1981) Nature 283:835; Hollenberg et al. (1981) Curr. Topics Microbiol.
Immunol. 96:119; Hollenberg et al. (1979) "The Expression of Bacterial Antibiotic
Resistance Genes in the Yeast Saccharomyces cerevisiae," in: Plasmids of Medical,
Environmental and Commercial Importance (eds. K.N. Timmis and A. Puhler); Mercerau-Puigalon
et al. (1980) Gene 11:163; Panthier et al. (1980) Curr. Genet. 2:109).

A DNA molecule may be expressed intracellularly in yeast. A promoter sequence
may be directly linked with the DNA molecule, in which case the first amino acid at the
N-terminus of the recombinant protein will always be a methionine, which is encoded by the
ATG start codon. If desired, methionine at the N-terminus may be cleaved from the protein
by in vitro incubation with cyanogen bromide.

Fusion proteins provide an alternative for yeast expression systems, as well as in
mammalian, plant, baculovirus, and bacterial expression systems. Usually, a DNA sequence
encoding the N-terminal portion of an endogenous yeast protein, or other stable protein, is
fused to the 5' end of heterologous coding sequences. Upon expression, this construct will
provide a fusion of the two amino acid sequences. For example, the yeast or human
superoxide dismutase (SOD) gene can be linked at the 5' terminus of a foreign gene and
expressed in yeast. The DNA sequence at the junction of the two amino acid sequences may
or may not encode a cleavable site. See e.g., EPO Publ. No. 196056. Another example is a
ubiquitin fusion protein. Such a fusion protein is made with the ubiquitin region that
preferably retains a site for a processing enzyme (e.g. ubiquitin-specific processing protease)
to cleave the ubiquitin from the foreign protein. Through this method, therefore, native
foreign protein can be isolated (e.g., WO88/024066).

Alternatively, foreign proteins can also be secreted from the cell into the growth
media by creating chimeric DNA molecules that encode a fusion protein comprised of a
leader sequence fragment that provides for secretion in yeast of the foreign protein. Preferably,
there are processing sites encoded between the leader fragment and the foreign gene that can
be cleaved either in vivo or in vitro. The leader sequence fragment usually encodes a signal
peptide comprised of hydrophobic amino acids which direct the secretion of the protein from
the cell.

DNA encoding suitable signal sequences can be derived from genes for secreted yeast
proteins, such as the yeast invertase gene (EPO Publ. No. 012 873; JPO Publ. No.
62:096,086) and the A-factor gene (U.S. Patent 4,588,684). Alternatively, leaders of
non-yeast origin, such as an interferon leader, exist that also provide for secretion in yeast (EPO
Publ. No. 060 057).

A preferred class of secretion leaders are those that employ a fragment of the yeast
alpha-factor gene, which contains both a "pre" signal sequence, and a "pro" region. The types
of alpha-factor fragments that can be employed include the full-length pre-pro alpha factor
leader (about 83 amino acid residues) as well as truncated alpha-factor leaders (usually about
25 to about 50 amino acid residues) (U.S. Patent Nos. 4,546,083 and 4,870,008; EPO Publ.
No. 324 274). Additional leaders employing an alpha-factor leader fragment that provides for
secretion include hybrid alpha-factor leaders made with a presequence of a first yeast, but a
pro-region from a second yeast alpha factor. (See e.g., PCT Publ. No. WO 89/02463.)

Usually, transcription termination sequences recognized by yeast are regulatory
regions located 3' to the translation stop codon, and thus together with the promoter flank the
coding sequence. These sequences direct the transcription of an mRNA which can be
translated into the polypeptide encoded by the DNA. Examples include transcription terminator
sequences and other yeast-recognized termination sequences, such as those from genes coding
for glycolytic enzymes.

Usually, the above described components, comprising a promoter, leader (if desired),
coding sequence of interest, and transcription termination sequence, are put together into
expression constructs. Expression constructs are often maintained in a replicon, such as an
extrachromosomal element (e.g., plasmids) capable of stable maintenance in a host, such as
yeast or bacteria. The replicon may have two replication systems, thus allowing it to be
maintained, for example, in yeast for expression and in a prokaryotic host for cloning and
amplification. Examples of such yeast-bacteria shuttle vectors include YEp24 (Botstein et al.
(1979) Gene 8:17-24), pCl/1 (Brake et al. (1984) Proc. Natl. Acad. Sci. USA 81:4642-4646),
and YRp17 (Stinchcomb et al. (1982) J. Mol. Biol. 158:157). In addition, a replicon may be
either a high or low copy number plasmid. A high copy number plasmid will generally have a
copy number ranging from about 5 to about 200, and usually about 10 to about 150. A host
containing a high copy number plasmid will preferably have at least about 10, and more
preferably at least about 20. Either a high or low copy number vector may be selected,
depending upon the effect of the vector and the foreign protein on the host. See e.g., Brake et
al., supra.

Alternatively, the expression constructs can be integrated into the yeast genome with
an integrating vector. Integrating vectors usually contain at least one sequence homologous to
a yeast chromosome that allows the vector to integrate, and preferably contain two
homologous sequences flanking the expression construct. Integrations appear to result from
recombinations between homologous DNA in the vector and the yeast chromosome
(Orr-Weaver et al. (1983) Methods in Enzymol. 101:228-245). An integrating vector may be
directed to a specific locus in yeast by selecting the appropriate homologous sequence for
inclusion in the vector. See Orr-Weaver et al., supra. One or more expression constructs may
integrate, possibly affecting levels of recombinant protein produced (Rine et al. (1983) Proc.
Natl. Acad. Sci. USA 80:6750). The chromosomal sequences included in the vector can occur
either as a single segment in the vector, which results in the integration of the entire vector, or
as two segments homologous to adjacent segments in the chromosome and flanking the
expression construct in the vector, which can result in the stable integration of only the
expression construct.

Usually, extrachromosomal and integrating expression constructs may contain
selectable markers to allow for the selection of yeast strains that have been transformed.
Selectable markers may include biosynthetic genes that can be expressed in the yeast host,
such as ADE2, HIS4, LEU2, TRP1, and ALG7, and the G418 resistance gene, which confer
resistance in yeast cells to tunicamycin and G418, respectively. In addition, a suitable
selectable marker may also provide yeast with the ability to grow in the presence of toxic
compounds, such as metal. For example, the presence of CUP1 allows yeast to grow in the
presence of copper ions (Butt et al. (1987) Microbiol. Rev. 51:351).

Alternatively, some of the above described components can be put together into 
transformation vectors. Transformation vectors are usually comprised of a selectable marker 
that is either maintained in a replicon or developed into an integrating vector, as described 
above. 

Expression and transformation vectors, either extrachromosomal replicons or
integrating vectors, have been developed for transformation into many yeasts. For example,
expression vectors and methods of introducing exogenous DNA into yeast hosts have been
developed for, inter alia, the following yeasts: Candida albicans (Kurtz, et al. (1986) Mol.
Cell. Biol. 6:142); Candida maltosa (Kunze, et al. (1985) J. Basic Microbiol. 25:141);
Hansenula polymorpha (Gleeson, et al. (1986) J. Gen. Microbiol. 132:3459; Roggenkamp et
al. (1986) Mol. Gen. Genet. 202:302); Kluyveromyces fragilis (Das, et al. (1984) J. Bacteriol.
158:1165); Kluyveromyces lactis (De Louvencourt et al. (1983) J. Bacteriol. 154:737; Van
den Berg et al. (1990) Bio/Technology 8:135); Pichia guillerimondii (Kunze et al. (1985) J.
Basic Microbiol. 25:141); Pichia pastoris (Cregg, et al. (1985) Mol. Cell. Biol. 5:3376; U.S.
Patent Nos. 4,837,148 and 4,929,555); Saccharomyces cerevisiae (Hinnen et al. (1978) Proc.
Natl. Acad. Sci. USA 75:1929; Ito et al. (1983) J. Bacteriol. 153:163); Schizosaccharomyces
pombe (Beach and Nurse (1981) Nature 300:706); and Yarrowia lipolytica (Davidow, et al.
(1985) Curr. Genet. 10:39; Gaillardin, et al. (1985) Curr. Genet. 10:49).

Methods of introducing exogenous DNA into yeast hosts are well-known in the art,
and usually include either the transformation of spheroplasts or of intact yeast cells treated
with alkali cations. Transformation procedures usually vary with the yeast species to be
transformed. See e.g., [Kurtz et al. (1986) Mol. Cell. Biol. 6:142; Kunze et al. (1985) J. Basic
Microbiol. 25:141; Candida]; [Gleeson et al. (1986) J. Gen. Microbiol. 132:3459;
Roggenkamp et al. (1986) Mol. Gen. Genet. 202:302; Hansenula]; [Das et al. (1984) J.
Bacteriol. 158:1165; De Louvencourt et al. (1983) J. Bacteriol. 154:737; Van den Berg et
al. (1990) Bio/Technology 8:135; Kluyveromyces]; [Cregg et al. (1985) Mol. Cell. Biol.
5:3376; Kunze et al. (1985) J. Basic Microbiol. 25:141; U.S. Patent Nos. 4,837,148 and
4,929,555; Pichia]; [Hinnen et al. (1978) Proc. Natl. Acad. Sci. USA 75:1929; Ito et al.
(1983) J. Bacteriol. 153:163; Saccharomyces]; [Beach and Nurse (1981) Nature 300:706;
Schizosaccharomyces]; [Davidow et al. (1985) Curr. Genet. 10:39; Gaillardin et al. (1985)
Curr. Genet. 10:49; Yarrowia].

Definitions 

A composition containing X is "substantially free" of Y when at least 85% by weight
of the total X+Y in the composition is X. Preferably, X comprises at least about 90% by
weight of the total of X+Y in the composition, more preferably at least about 95% or even
99% by weight.
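
The weight-fraction test in this definition can be sketched in a few lines of Python. This is only an illustration of the arithmetic; the function names and the default threshold argument are assumptions introduced here and are not part of the specification.

```python
def weight_fraction_of_x(x_weight: float, y_weight: float) -> float:
    """Return the fraction, by weight, of X in the total X+Y."""
    total = x_weight + y_weight
    if total <= 0:
        raise ValueError("total weight of X+Y must be positive")
    return x_weight / total


def is_substantially_free(x_weight: float, y_weight: float,
                          threshold: float = 0.85) -> bool:
    """True when X makes up at least `threshold` of X+Y by weight.

    The 0.85 default mirrors the 85% figure in the definition; the
    preferred levels (0.90, 0.95, 0.99) can be passed explicitly.
    """
    return weight_fraction_of_x(x_weight, y_weight) >= threshold
```

For instance, a preparation with 90 mg of X and 10 mg of Y satisfies the 85% and 90% levels but not the 95% level.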

The term "heterologous" refers to two biological components that are not found
together in nature. The components may be host cells, genes, or regulatory regions, such as
promoters. Although the heterologous components are not found together in nature, they can
function together, as when a promoter heterologous to a gene is operably linked to the gene.
Another example is where a Neisserial sequence is heterologous to a mouse host cell.

An "origin of replication" is a polynucleotide sequence that initiates and regulates
replication of polynucleotides, such as an expression vector. The origin of replication behaves
as an autonomous unit of polynucleotide replication within a cell, capable of replication
under its own control. An origin of replication may be needed for a vector to replicate in a
particular host cell. With certain origins of replication, an expression vector can be
reproduced at a high copy number in the presence of the appropriate proteins within the cell.
Examples of origins are the autonomously replicating sequences, which are effective in yeast;
and the viral T-antigen, effective in COS-7 cells.

A "mutant" sequence is defined as a DNA, RNA or amino acid sequence differing
from but having homology with the native or disclosed sequence. Depending on the
particular sequence, the degree of homology between the native or disclosed sequence and
the mutant sequence is preferably greater than 50% (e.g., 60%, 70%, 80%, 90%, 95%, 99%
or more) which is calculated as described above. As used herein, an "allelic variant" of a
nucleic acid molecule, or region, for which nucleic acid sequence is provided herein is a
nucleic acid molecule, or region, that occurs at essentially the same locus in the genome of
another or second isolate, and that, due to natural variation caused by, for example, mutation
or recombination, has a similar but not identical nucleic acid sequence. A coding region
allelic variant typically encodes a protein having similar activity to that of the protein
encoded by the gene to which it is being compared. An allelic variant can also comprise an
alteration in the 5' or 3' untranslated regions of the gene, such as in regulatory control regions
(see, for example, U.S. Patent 5,753,235).
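
As an illustration of the 50% threshold stated for a mutant sequence, the following Python sketch scores two sequences that have already been aligned. It is an assumption made for illustration only: the specification calculates homology "as described above", by a method not reproduced in this passage, and the simple column-wise percent identity used here (with "-" as the gap character) is a stand-in, not that method. All function names are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns in which both sequences carry the same residue.

    Inputs must be pre-aligned (equal length, '-' for gaps). Columns that are
    gaps in both sequences are ignored; a gap opposite a residue is a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("alignment contains no residues")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def exceeds_homology_threshold(aligned_a: str, aligned_b: str,
                               minimum_percent: float = 50.0) -> bool:
    """True when the score is strictly greater than the stated minimum."""
    return percent_identity(aligned_a, aligned_b) > minimum_percent
```

Stricter levels listed in the text (60%, 70%, and so on) can be checked by passing a different `minimum_percent`.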

Antibodies 

As used herein, the term "antibody" refers to a polypeptide or group of polypeptides
composed of at least one antibody combining site. An "antibody combining site" is the
three-dimensional binding space with an internal surface shape and charge distribution
complementary to the features of an epitope of an antigen, which allows a binding of the
antibody with the antigen. "Antibody" includes, for example, vertebrate antibodies, hybrid
antibodies, chimeric antibodies, humanized antibodies, altered antibodies, univalent
antibodies, Fab proteins, and single domain antibodies.

Antibodies against the proteins of the invention are useful for affinity
chromatography, immunoassays, and distinguishing/identifying Neisseria MenB proteins.
Antibodies elicited against the proteins of the present invention bind to antigenic
polypeptides or proteins or protein fragments that are present and specifically associated with
strains of Neisseria meningitidis MenB. In some instances, these antigens may be associated
with specific strains, such as those antigens specific for the MenB strains. The antibodies of
the invention may be immobilized to a matrix and utilized in an immunoassay or on an
affinity chromatography column, to enable the detection and/or separation of polypeptides,
proteins or protein fragments or cells comprising such polypeptides, proteins or protein
fragments. Alternatively, such polypeptides, proteins or protein fragments may be
immobilized so as to detect antibodies bindably specific thereto.

Antibodies to the proteins of the invention, both polyclonal and monoclonal, may be
prepared by conventional methods. In general, the protein is first used to immunize a suitable
animal, preferably a mouse, rat, rabbit or goat. Rabbits and goats are preferred for the
preparation of polyclonal sera due to the volume of serum obtainable, and the availability of
labeled anti-rabbit and anti-goat antibodies. Immunization is generally performed by mixing
or emulsifying the protein in saline, preferably in an adjuvant such as Freund's complete
adjuvant, and injecting the mixture or emulsion parenterally (generally subcutaneously or
intramuscularly). A dose of 50-200 μg/injection is typically sufficient. Immunization is
generally boosted 2-6 weeks later with one or more injections of the protein in saline,
preferably using Freund's incomplete adjuvant. One may alternatively generate antibodies by
in vitro immunization using methods known in the art, which for the purposes of this
invention is considered equivalent to in vivo immunization. Polyclonal antisera are obtained by
bleeding the immunized animal into a glass or plastic container, incubating the blood at 25°C
for one hour, followed by incubating at 4°C for 2-18 hours. The serum is recovered by
centrifugation (e.g., 1,000g for 10 minutes). About 20-50 ml per bleed may be obtained from
rabbits.

Monoclonal antibodies are prepared using the standard method of Kohler & Milstein
(Nature (1975) 256:495-96), or a modification thereof. Typically, a mouse or rat is
immunized as described above. However, rather than bleeding the animal to extract serum,
the spleen (and optionally several large lymph nodes) is removed and dissociated into single
cells. If desired, the spleen cells may be screened (after removal of nonspecifically adherent
cells) by applying a cell suspension to a plate or well coated with the protein antigen. B-cells
that express membrane-bound immunoglobulin specific for the antigen bind to the plate, and
are not rinsed away with the rest of the suspension. Resulting B-cells, or all dissociated
spleen cells, are then induced to fuse with myeloma cells to form hybridomas, and are
cultured in a selective medium (e.g., hypoxanthine, aminopterin, thymidine medium,
"HAT"). The resulting hybridomas are plated by limiting dilution, and are assayed for the
production of antibodies which bind specifically to the immunizing antigen (and which do
not bind to unrelated antigens). The selected MAb-secreting hybridomas are then cultured
either in vitro (e.g., in tissue culture bottles or hollow fiber reactors), or in vivo (as ascites in
mice).

If desired, the antibodies (whether polyclonal or monoclonal) may be labeled using
conventional techniques. Suitable labels include fluorophores, chromophores, radioactive
atoms (particularly ³²P and ¹²⁵I), electron-dense reagents, enzymes, and ligands having
specific binding partners. Enzymes are typically detected by their activity. For example,
horseradish peroxidase is usually detected by its ability to convert
3,3',5,5'-tetramethylbenzidine (TMB) to a blue pigment, quantifiable with a
spectrophotometer. "Specific binding partner" refers to a protein capable of binding a ligand
molecule with high specificity, as for example in the case of an antigen and a monoclonal
antibody specific therefor. Other specific binding partners include biotin and avidin or
streptavidin, IgG and protein A, and the numerous receptor-ligand couples known in the art.
It should be understood that the above description is not meant to categorize the various
labels into distinct classes, as the same label may serve in several different modes. For
example, ¹²⁵I may serve as a radioactive label or as an electron-dense reagent. HRP may
serve as enzyme or as antigen for a MAb. Further, one may combine various labels for
desired effect. For example, MAbs and avidin also require labels in the practice of this
invention: thus, one might label a MAb with biotin, and detect its presence with avidin
labeled with ¹²⁵I, or with an anti-biotin MAb labeled with HRP. Other permutations and
possibilities will be readily apparent to those of ordinary skill in the art, and are considered as
equivalents within the scope of the instant invention.

Antigens, immunogens, polypeptides, proteins or protein fragments of the present
invention elicit formation of specific binding partner antibodies. These antigens,
immunogens, polypeptides, proteins or protein fragments of the present invention comprise
immunogenic compositions of the present invention. Such immunogenic compositions may
further comprise or include adjuvants, carriers, or other compositions that promote or
enhance or stabilize the antigens, polypeptides, proteins or protein fragments of the present
invention. Such adjuvants and carriers will be readily apparent to those of ordinary skill in
the art.



Pharmaceutical Compositions 
Pharmaceutical compositions can include either polypeptides, antibodies, or nucleic
acid of the invention. The pharmaceutical compositions will comprise a therapeutically
effective amount of either polypeptides, antibodies, or polynucleotides of the claimed
invention.

The term "therapeutically effective amount" as used herein refers to an amount of a
therapeutic agent to treat, ameliorate, or prevent a desired disease or condition, or to exhibit a
detectable therapeutic or preventative effect. The effect can be detected by, for example,
chemical markers or antigen levels. Therapeutic effects also include reduction in physical
symptoms, such as decreased body temperature, when given to a patient that is febrile. The
precise effective amount for a subject will depend upon the subject's size and health, the
nature and extent of the condition, and the therapeutics or combination of therapeutics
selected for administration. Thus, it is not useful to specify an exact effective amount in
advance. However, the effective amount for a given situation can be determined by routine
experimentation and is within the judgment of the clinician.

For purposes of the present invention, an effective dose will be from about 0.01 mg/kg
to 50 mg/kg or 0.05 mg/kg to about 10 mg/kg of the DNA constructs in the individual to
which it is administered.
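
The per-kilogram ranges above scale linearly with body weight. A minimal Python sketch of that arithmetic follows; the range labels "broad" and "narrow", the function name, and the example body weight are assumptions introduced for illustration, and only the four mg/kg figures come from the text.

```python
# mg/kg bounds taken from the passage; the dictionary keys are hypothetical labels.
DOSE_RANGES_MG_PER_KG = {
    "broad": (0.01, 50.0),
    "narrow": (0.05, 10.0),
}


def dose_window_mg(body_weight_kg: float, range_name: str = "broad") -> tuple:
    """Return (lower, upper) total dose in mg for a given body weight."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    low_per_kg, high_per_kg = DOSE_RANGES_MG_PER_KG[range_name]
    return (low_per_kg * body_weight_kg, high_per_kg * body_weight_kg)


if __name__ == "__main__":
    # Example: a hypothetical 70 kg individual.
    for name in DOSE_RANGES_MG_PER_KG:
        low, high = dose_window_mg(70.0, name)
        print(f"{name}: {low:.2f} mg to {high:.2f} mg")
```

As the preceding paragraph notes, the precise effective amount depends on the subject and condition, so a calculation of this kind only bounds the range and does not select a dose.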

A pharmaceutical composition can also contain a pharmaceutically acceptable carrier.
The term "pharmaceutically acceptable carrier" refers to a carrier for administration of a
therapeutic agent, such as antibodies or a polypeptide, genes, and other therapeutic agents.
The term refers to any pharmaceutical carrier that does not itself induce the production of
antibodies harmful to the individual receiving the composition, and which may be
administered without undue toxicity. Suitable carriers may be large, slowly metabolized
macromolecules such as proteins, polysaccharides, polylactic acids, polyglycolic acids,
polymeric amino acids, amino acid copolymers, and inactive virus particles. Such carriers are
well known to those of ordinary skill in the art.

Pharmaceutically acceptable salts can be used therein, for example, mineral acid salts
such as hydrochlorides, hydrobromides, phosphates, sulfates, and the like; and the salts of
organic acids such as acetates, propionates, malonates, benzoates, and the like. A thorough
discussion of pharmaceutically acceptable excipients is available in Remington's
Pharmaceutical Sciences (Mack Pub. Co., N.J. 1991).

Pharmaceutically acceptable carriers in therapeutic compositions may contain liquids
such as water, saline, glycerol and ethanol. Additionally, auxiliary substances, such as
wetting or emulsifying agents, pH buffering substances, and the like, may be present in such
vehicles. Typically, the therapeutic compositions are prepared as injectables, either as liquid
solutions or suspensions; solid forms suitable for solution in, or suspension in, liquid vehicles
prior to injection may also be prepared. Liposomes are included within the definition of a
pharmaceutically acceptable carrier.

Delivery Methods 

Once formulated, the compositions of the invention can be administered directly to
the subject. The subjects to be treated can be animals; in particular, human subjects can be
treated.

Direct delivery of the compositions will generally be accomplished by injection,
either subcutaneously, intraperitoneally, intravenously or intramuscularly or delivered to the
interstitial space of a tissue. The compositions can also be administered into a lesion. Other
modes of administration include oral and pulmonary administration, suppositories, and
transdermal and transcutaneous applications, needles, and gene guns or hyposprays. Dosage
treatment may be a single dose schedule or a multiple dose schedule.

Vaccines

Vaccines according to the invention may either be prophylactic (i.e., to prevent 
infection) or therapeutic (i.e., to treat disease after infection). 

Such vaccines comprise immunizing antigen(s) or immunogen(s), immunogenic 
polypeptide, protein(s) or protein fragments, or nucleic acids (e.g., ribonucleic acid or 
deoxyribonucleic acid), usually in combination with "pharmaceutically acceptable carriers,"
which include any carrier that does not itself induce the production of antibodies harmful to
the individual receiving the composition. Suitable carriers are typically large, slowly
metabolized macromolecules such as proteins, polysaccharides, polylactic acids, polyglycolic
acids, polymeric amino acids, amino acid copolymers, lipid aggregates (such as oil droplets
or liposomes), and inactive virus particles. Such carriers are well known to those of ordinary
skill in the art. Additionally, these carriers may function as immunostimulating agents
("adjuvants"). Furthermore, the immunogen or antigen may be conjugated to a bacterial
toxoid, such as a toxoid from diphtheria, tetanus, cholera, H. pylori, or other pathogens.

Preferred adjuvants to enhance effectiveness of the composition include, but are not
limited to: (1) aluminum salts (alum), such as aluminum hydroxide, aluminum phosphate,
aluminum sulfate, etc; (2) oil-in-water emulsion formulations (with or without other specific
immunostimulating agents such as muramyl peptides (see below) or bacterial cell wall
components), such as for example (a) MF59 (PCT Publ. No. WO 90/14837), containing 5%
Squalene, 0.5% Tween 80, and 0.5% Span 85 (optionally containing various amounts of
MTP-PE (see below), although not required) formulated into submicron particles using a
microfluidizer such as Model 110Y microfluidizer (Microfluidics, Newton, MA), (b) SAF,
containing 10% Squalane, 0.4% Tween 80, 5% pluronic-blocked polymer L121, and thr-MDP
(see below) either microfluidized into a submicron emulsion or vortexed to generate a
larger particle size emulsion, and (c) Ribi™ adjuvant system (RAS), (Ribi Immunochem,
Hamilton, MT) containing 2% Squalene, 0.2% Tween 80, and one or more bacterial cell wall
components from the group consisting of monophosphorylipid A (MPL), trehalose
dimycolate (TDM), and cell wall skeleton (CWS), preferably MPL + CWS (Detox™);
(3) saponin adjuvants, such as Stimulon™ (Cambridge Bioscience, Worcester, MA), or
particles generated therefrom such as ISCOMs (immunostimulating complexes);
(4) Complete Freund's Adjuvant (CFA) and Incomplete Freund's Adjuvant (IFA);
(5) cytokines, such as interleukins (e.g., IL-1, IL-2, IL-4, IL-5, IL-6, IL-7, IL-12, etc.),
interferons (e.g., gamma interferon), macrophage colony stimulating factor (M-CSF), tumor
necrosis factor (TNF), etc; (6) detoxified mutants of a bacterial ADP-ribosylating toxin such
as a cholera toxin (CT), a pertussis toxin (PT), or an E. coli heat-labile toxin (LT),
particularly LT-K63, LT-R72, CT-S109, PT-K9/G129; see, e.g., WO 93/13302 and WO
92/19265; and (7) other substances that act as immunostimulating agents to enhance the
effectiveness of the composition. Alum and MF59 are preferred.
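
As a worked illustration of the emulsion compositions listed above, the following Python sketch converts the stated percentages into component amounts for a batch. It rests on two assumptions that the text does not confirm: the percentages are treated as percent by volume, and everything not listed is treated as aqueous phase. The function name `component_amounts` and the batch size are hypothetical.

```python
# Percent compositions as stated in the text for two of the listed emulsions.
# ASSUMPTION: percentages are treated as volume/volume; the text does not say.
MF59 = {"squalene": 5.0, "Tween 80": 0.5, "Span 85": 0.5}
RAS = {"squalene": 2.0, "Tween 80": 0.2}


def component_amounts(batch_ml, percents):
    """Return the volume (mL) of each listed component for a batch of batch_ml."""
    if batch_ml <= 0:
        raise ValueError("batch volume must be positive")
    if sum(percents.values()) > 100:
        raise ValueError("component percentages exceed 100%")
    amounts = {name: batch_ml * pct / 100.0 for name, pct in percents.items()}
    # ASSUMPTION: whatever is not listed is counted as aqueous phase.
    amounts["aqueous phase (remainder)"] = batch_ml - sum(amounts.values())
    return amounts


if __name__ == "__main__":
    # Hypothetical 100 mL batch of the MF59-type composition.
    for name, ml in component_amounts(100.0, MF59).items():
        print(f"{name}: {ml:.2f} mL")
```

The same function can be applied to any of the other listed compositions by supplying its percentages as a dictionary.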



wo 00/22430 



PCT/US99/23573 



As mentioned above, muramyl peptides include, but are not limited to,
N-acetyl-muramyl-L-threonyl-D-isoglutamine (thr-MDP),
N-acetyl-normuramyl-L-alanyl-D-isoglutamine (nor-MDP),
N-acetylmuramyl-L-alanyl-D-isoglutaminyl-L-alanine-2-(1'-2'-dipalmitoyl-sn-glycero-3-hydroxyphosphoryloxy)-ethylamine
(MTP-PE), etc.

The vaccine compositions comprising immunogenic compositions (e.g., which may
include the antigen, pharmaceutically acceptable carrier, and adjuvant) typically will contain
diluents, such as water, saline, glycerol, ethanol, etc. Additionally, auxiliary substances, such
as wetting or emulsifying agents, pH buffering substances, and the like, may be present in
such vehicles. Alternatively, vaccine compositions comprising immunogenic compositions
may comprise an antigen, polypeptide, protein, protein fragment or nucleic acid in a
pharmaceutically acceptable carrier.

More specifically, vaccines comprising immunogenic compositions comprise an
immunologically effective amount of the immunogenic polypeptides, as well as any other of
the above-mentioned components, as needed. By "immunologically effective amount", it is
meant that the administration of that amount to an individual, either in a single dose or as part
of a series, is effective for treatment or prevention. This amount varies depending upon the
health and physical condition of the individual to be treated, the taxonomic group of
individual to be treated (e.g., nonhuman primate, primate, etc.), the capacity of the
individual's immune system to synthesize antibodies, the degree of protection desired, the
formulation of the vaccine, the treating doctor's assessment of the medical situation, and other
relevant factors. It is expected that the amount will fall in a relatively broad range that can be
determined through routine trials.

Typically, the vaccine compositions or immunogenic compositions are prepared as
injectables, either as liquid solutions or suspensions; solid forms suitable for solution in, or
suspension in, liquid vehicles prior to injection may also be prepared. The preparation also
may be emulsified or encapsulated in liposomes for enhanced adjuvant effect, as discussed
above under pharmaceutically acceptable carriers.

The immunogenic compositions are conventionally administered parenterally, e.g., by
injection, either subcutaneously or intramuscularly. Additional formulations suitable for
other modes of administration include oral and pulmonary formulations, suppositories, and
transdermal and transcutaneous applications. Dosage treatment may be a single dose schedule
or a multiple dose schedule. The vaccine may be administered in conjunction with other
immunoregulatory agents.

As an alternative to protein-based vaccines, DNA vaccination may be employed (e.g., 
Robinson & Torres (1997) Seminars in Immunology 9:271-283; Donnelly et al. (1997) Annu
Rev Immunol 15:617-648).

Gene Delivery Vehicles 

Gene therapy vehicles for delivery of constructs, including a coding sequence of a
therapeutic of the invention, to be delivered to the mammal for expression in the mammal,
can be administered either locally or systemically. These constructs can utilize viral or
non-viral vector approaches in in vivo or ex vivo modality. Expression of such coding
sequence can be induced using endogenous mammalian or heterologous promoters.
Expression of the coding sequence in vivo can be either constitutive or regulated.

The invention includes gene delivery vehicles capable of expressing the contemplated
nucleic acid sequences. The gene delivery vehicle is preferably a viral vector and, more
preferably, a retroviral, adenoviral, adeno-associated viral (AAV), herpes viral, or alphavirus
vector. The viral vector can also be an astrovirus, coronavirus, orthomyxovirus, papovavirus,
paramyxovirus, parvovirus, picornavirus, poxvirus, or togavirus viral vector. See generally,
Jolly (1994) Cancer Gene Therapy 1:51-64; Kimura (1994) Human Gene Therapy
5:845-852; Connelly (1995) Human Gene Therapy 6:185-193; and Kaplitt (1994) Nature
Genetics 6:148-153.

Retroviral vectors are well known in the art, including B, C and D type retroviruses,
xenotropic retroviruses (for example, NZB-X1, NZB-X2 and NZB9-1 (see O'Neill (1985) J.
Virol. 53:160)), polytropic retroviruses, e.g., MCF and MCF-MLV (see Kelly (1983) J. Virol.
45:291), spumaviruses and lentiviruses. See RNA Tumor Viruses, Second Edition, Cold
Spring Harbor Laboratory, 1985.

Portions of the retroviral gene therapy vector may be derived from different
retroviruses. For example, retrovector LTRs may be derived from a Murine Sarcoma Virus, a
tRNA binding site from a Rous Sarcoma Virus, a packaging signal from a Murine Leukemia
Virus, and an origin of second strand synthesis from an Avian Leukosis Virus.

These recombinant retroviral vectors may be used to generate transduction competent
retroviral vector particles by introducing them into appropriate packaging cell lines (see US
patent 5,591,624). Retrovirus vectors can be constructed for site-specific integration into host
cell DNA by incorporation of a chimeric integrase enzyme into the retroviral particle (see
WO96/37626). It is preferable that the recombinant viral vector is a replication defective
recombinant virus.

Packaging cell lines suitable for use with the above-described retrovirus vectors are 
well known in the art, are readily prepared (see WO95/30763 and WO92/05266), and can be 
used to create producer cell lines (also termed vector cell lines or "VCLs") for the production 
of recombinant vector particles. Preferably, the packaging cell lines are made from human
parent cells (e.g., HT1080 cells) or mink parent cell lines, which eliminates inactivation in 
human serum. 

Preferred retroviruses for the construction of retroviral gene therapy vectors include
Avian Leukosis Virus, Bovine Leukemia Virus, Murine Leukemia Virus, Mink-Cell
Focus-Inducing Virus, Murine Sarcoma Virus, Reticuloendotheliosis Virus and Rous
Sarcoma Virus. Particularly preferred Murine Leukemia Viruses include 4070A and 1504A
(Hartley and Rowe (1976) J Virol 19:19-25), Abelson (ATCC No. VR-999), Friend (ATCC
No. VR-245), Graffi, Gross (ATCC No. VR-590), Kirsten, Harvey Sarcoma Virus and
Rauscher (ATCC No. VR-998) and Moloney Murine Leukemia Virus (ATCC No. VR-190).
Such retroviruses may be obtained from depositories or collections such as the American
Type Culture Collection ("ATCC") in Rockville, Maryland or isolated from known sources
using commonly available techniques.

Exemplary known retroviral gene therapy vectors employable in this invention
include those described in patent applications GB2200651, EP0415731, EP0345242,
EP0334301, WO89/02468, WO89/05349, WO89/09271, WO90/02806, WO90/07936,
WO94/03622, WO93/25698, WO93/25234, WO93/11230, WO93/10218, WO91/02805,
WO91/02825, WO95/07994, US 5,219,740, US 4,405,712, US 4,861,719, US 4,980,289, US
4,777,127, US 5,591,624. See also Vile (1993) Cancer Res 53:3860-3864; Vile (1993)
Cancer Res 53:962-967; Ram (1993) Cancer Res 53:83-88; Takamiya (1992) J
Neurosci Res 33:493-503; Baba (1993) J Neurosurg 79:729-735; Mann (1983) Cell 33:153;
Cone (1984) Proc Natl Acad Sci 81:6349; and Miller (1990) Human Gene Therapy 1.



Human adenoviral gene therapy vectors are also known in the art and employable in
this invention. See, for example, Berkner (1988) Biotechniques 6:616 and Rosenfeld (1991)
Science 252:431, and WO93/07283, WO93/06223, and WO93/07282. Exemplary known
adenoviral gene therapy vectors employable in this invention include those described in the
above referenced documents and in WO94/12649, WO93/03769, WO93/19191,
WO94/28938, WO95/11984, WO95/00655, WO95/27071, WO95/29993, WO95/34671,
WO96/05320, WO94/08026, WO94/11506, WO93/06223, WO94/24299, WO95/14102,
WO95/24297, WO95/02697, WO94/28152, WO94/24299, WO95/09241, WO95/25807,
WO95/05835, WO94/18922 and WO95/09654. Alternatively, administration of DNA linked
to killed adenovirus as described in Curiel (1992) Hum. Gene Ther. 3:147-154 may be
employed. The gene delivery vehicles of the invention also include adenovirus associated 
virus (AAV) vectors. Leading and preferred examples of such vectors for use in this 
invention are the AAV-2 based vectors disclosed in Srivastava, WO93/09239. Most preferred 
AAV vectors comprise the two AAV inverted terminal repeats in which the native
D-sequences are modified by substitution of nucleotides, such that at least 5 native
nucleotides and up to 18 native nucleotides, preferably at least 10 native nucleotides up to 18
native nucleotides, most preferably 10 native nucleotides are retained and the remaining
nucleotides of the D-sequence are deleted or replaced with non-native nucleotides. The native
D-sequences of the AAV inverted terminal repeats are sequences of 20 consecutive
nucleotides in each AAV inverted terminal repeat (i.e., there is one sequence at each end)
which are not involved in hairpin (HP) formation. The non-native replacement nucleotide may be any
nucleotide other than the nucleotide found in the native D-sequence in the same position.
Other employable exemplary AAV vectors are pWP-19 and pWN-1, both of which are disclosed
in Nahreini (1993) Gene 124:257-262. Another example of such an AAV vector is psub201
(see Samulski (1987) J. Virol. 61:3096). Another exemplary AAV vector is the Double-D
ITR vector. Construction of the Double-D ITR vector is disclosed in US Patent 5,478,745.
Still other vectors are those disclosed in Carter US Patent 4,797,368 and Muzyczka US Patent
5,139,941, Chatterjee US Patent 5,474,935, and Kotin WO94/288157. Yet a further example
of an AAV vector employable in this invention is SSV9AFABTKneo, which contains the
AFP enhancer and albumin promoter and directs expression predominantly in the liver. Its
structure and construction are disclosed in Su (1996) Human Gene Therapy 7:463-470.
Additional AAV gene therapy vectors are described in US 5,354,678, US 5,173,414, US
5,139,941, and US 5,252,479.

The gene therapy vectors comprising sequences of the invention also include herpes
vectors. Leading and preferred examples are herpes simplex virus vectors containing a
sequence encoding a thymidine kinase polypeptide such as those disclosed in US 5,288,641
and EP0176170 (Roizman). Additional exemplary herpes simplex virus vectors include
HFEM/ICP6-LacZ disclosed in WO95/04139 (Wistar Institute), pHSVlac described in Geller
(1988) Science 241:1667-1669 and in WO90/09441 and WO92/07945, HSV Us3::pgC-lacZ
described in Fink (1992) Human Gene Therapy 3:11-19 and HSV 7134, 2 RH 105 and GAL4
described in EP 0453242 (Breakefield), and those deposited with the ATCC as accession
numbers ATCC VR-977 and ATCC VR-260.

Also contemplated are alpha virus gene therapy vectors that can be employed in this
invention. Preferred alpha virus vectors are Sindbis virus vectors, Togaviruses, Semliki
Forest virus (ATCC VR-67; ATCC VR-1247), Middelburg virus (ATCC VR-370), Ross
River virus (ATCC VR-373; ATCC VR-1246), Venezuelan equine encephalitis virus (ATCC
VR-923; ATCC VR-1250; ATCC VR-1249; ATCC VR-532), and those described in US
patents 5,091,309, 5,217,879, and WO92/10578. More particularly, those alpha virus vectors
described in U.S. Serial No. 08/405,627, filed March 15, 1995, WO94/21792, WO92/10578,
WO95/07994, US 5,091,309 and US 5,217,879 are employable. Such alpha viruses may be
obtained from depositories or collections such as the ATCC in Rockville, Maryland or
isolated from known sources using commonly available techniques. Preferably, alphavirus
vectors with reduced cytotoxicity are used (see USSN 08/679640).

DNA vector systems such as eukaryotic layered expression systems are also useful for
expressing the nucleic acids of the invention. See WO95/07994 for a detailed description of
eukaryotic layered expression systems. Preferably, the eukaryotic layered expression systems
of the invention are derived from alphavirus vectors and most preferably from Sindbis viral
vectors.

Other viral vectors suitable for use in the present invention include those derived from
poliovirus, for example ATCC VR-58 and those described in Evans (1989) Nature 339:385
and Sabin (1973) J. Biol. Standardization 1:115; rhinovirus, for example ATCC VR-1110
and those described in Arnold (1990) J Cell Biochem L401; pox viruses such as canary pox
virus or vaccinia virus, for example ATCC VR-111 and ATCC VR-2010 and those described
in Fisher-Hoch (1989) Proc Natl Acad Sci 86:317; Flexner (1989) Ann NY Acad Sci 569:86,
Flexner (1990) Vaccine 8:17; in US 4,603,112 and US 4,769,330 and WO89/01973; SV40
virus, for example ATCC VR-305 and those described in Mulligan (1979) Nature 277:108
and Madzak (1992) J Gen Virol 73:1533; influenza virus, for example ATCC VR-797 and
recombinant influenza viruses made employing reverse genetics techniques as described in
US 5,166,057 and in Enami (1990) Proc Natl Acad Sci 87:3802-3805; Enami & Palese
(1991) J Virol 65:2711-2713 and Luytjes (1989) Cell 59:110, (see also McMichael (1983)
NEJ Med 309:13, and Yap (1978) Nature 273:238 and Nature (1979) 277:108); human
immunodeficiency virus as described in EP-0386882 and in Buchschacher (1992) J. Virol.
66:2731; measles virus, for example ATCC VR-67 and VR-1247 and those described in
EP-0440219; Aura virus, for example ATCC VR-368; Bebaru virus, for example ATCC VR-600
and ATCC VR-1240; Cabassou virus, for example ATCC VR-922; Chikungunya virus, for
example ATCC VR-64 and ATCC VR-1241; Fort Morgan Virus, for example ATCC
VR-924; Getah virus, for example ATCC VR-369 and ATCC VR-1243; Kyzylagach virus,
for example ATCC VR-927; Mayaro virus, for example ATCC VR-66; Mucambo virus, for
example ATCC VR-580 and ATCC VR-1244; Ndumu virus, for example ATCC VR-371;
Pixuna virus, for example ATCC VR-372 and ATCC VR-1245; Tonate virus, for example
ATCC VR-925; Triniti virus, for example ATCC VR-469; Una virus, for example ATCC
VR-374; Whataroa virus, for example ATCC VR-926; Y-62-33 virus, for example ATCC
VR-375; O'Nyong virus; Eastern encephalitis virus, for example ATCC VR-65 and ATCC
VR-1242; Western encephalitis virus, for example ATCC VR-70, ATCC VR-1251, ATCC
VR-622 and ATCC VR-1252; and coronavirus, for example ATCC VR-740 and those
described in Hamre (1966) Proc Soc Exp Biol Med 121:190.

Delivery of the compositions of this invention into cells is not limited to the above
mentioned viral vectors. Other delivery methods and media may be employed such as, for
example, nucleic acid expression vectors, polycationic condensed DNA linked or unlinked to
killed adenovirus alone (for example, see US Serial No. 08/366,787, filed December 30, 1994,
and Curiel (1992) Hum Gene Ther 3:147-154), ligand linked DNA (for example, see Wu (1989)
J Biol Chem 264:16985-16987), eucaryotic cell delivery vehicles (for example, see US
Serial No. 08/240,030, filed May 9, 1994, and US Serial No. 08/404,796), deposition of
photopolymerized hydrogel materials, hand-held gene transfer particle gun, as described in
US Patent 5,149,655, ionizing radiation as described in US 5,206,152 and in WO92/11033,
nucleic charge neutralization or fusion with cell membranes. Additional approaches are
described in Philip (1994) Mol Cell Biol 14:2411-2418 and in Woffendin (1994) Proc Natl
Acad Sci 91:1581-1585.

Particle mediated gene transfer may be employed, for example see US Serial No. 
60/023,867. Briefly, the sequence can be inserted into conventional vectors that contain 
conventional control sequences for high level expression, and then incubated with synthetic 
gene transfer molecules such as polymeric DNA-binding cations like polylysine, protamine,
and albumin, linked to cell targeting ligands such as asialoorosomucoid, as described in Wu
& Wu (1987) J. Biol. Chem. 262:4429-4432, insulin as described in Hucked (1990) Biochem 
Pharmacol 40:253-263, galactose as described in Plank (1992) Bioconjugate Chem 
3:533-539, lactose or transferrin. 

Naked DNA may also be employed to transform a host cell. Exemplary naked DNA
introduction methods are described in WO 90/11092 and US 5,580,859. Uptake efficiency
may be improved using biodegradable latex beads. DNA coated latex beads are efficiently 
transported into cells after endocytosis initiation by the beads. The method may be improved 
further by treatment of the beads to increase hydrophobicity and thereby facilitate disruption 
of the endosome and release of the DNA into the cytoplasm. 

Liposomes that can act as gene delivery vehicles are described in U.S. 5,422,120,
WO95/13796, WO94/23697, WO91/14445 and EP-524,968. As described in USSN
60/023,867, on non-viral delivery, the nucleic acid sequences encoding a polypeptide can be
inserted into conventional vectors that contain conventional control sequences for high level
expression, and then be incubated with synthetic gene transfer molecules such as polymeric
DNA-binding cations like polylysine, protamine, and albumin, linked to cell targeting ligands
such as asialoorosomucoid, insulin, galactose, lactose, or transferrin. Other delivery systems
include the use of liposomes to encapsulate DNA comprising the gene under the control of a
variety of tissue-specific or ubiquitously-active promoters. Further non-viral delivery suitable
for use includes mechanical delivery systems such as the approach described in Woffendin et
al (1994) Proc. Natl. Acad. Sci. USA 91(24):11581-11585. Moreover, the coding sequence
and the product of expression of such can be delivered through deposition of
photopolymerized hydrogel materials. Other conventional methods for gene delivery that can
be used for delivery of the coding sequence include, for example, use of hand-held gene
transfer particle gun, as described in U.S. 5,149,655; use of ionizing radiation for activating
transferred gene, as described in U.S. 5,206,152 and WO92/11033.

Exemplary liposome and polycationic gene delivery vehicles are those described in
US 5,422,120 and 4,762,915; in WO 95/13796; WO94/23697; and WO91/14445; in
EP-0524968; and in Stryer, Biochemistry, pages 236-240 (1975) W.H. Freeman, San Francisco;
Szoka (1980) Biochem Biophys Acta 600:1; Bayer (1979) Biochem Biophys Acta 550:464;
Rivnay (1987) Meth Enzymol 149:119; Wang (1987) Proc Natl Acad Sci 84:7851; Plant
(1989) Anal Biochem 176:420.

A polynucleotide composition can comprise a therapeutically effective amount of a 
gene therapy vehicle, as the term is defined above. For purposes of the present invention, an 
effective dose will be from about 0.01 mg/kg to 50 mg/kg or 0.05 mg/kg to about 10 mg/kg
of the DNA constructs in the individual to which it is administered. 
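
The per-kilogram ranges above translate into a total amount per subject by simple multiplication. The following Python sketch illustrates that arithmetic only; the function name `dose_range_mg` and the 70 kg body mass are hypothetical, and nothing here is a dosing recommendation.

```python
# Ranges quoted in the text, in mg of DNA construct per kg of body mass.
BROAD_RANGE_MG_PER_KG = (0.01, 50.0)
NARROW_RANGE_MG_PER_KG = (0.05, 10.0)


def dose_range_mg(body_mass_kg, low_mg_per_kg=0.01, high_mg_per_kg=50.0):
    """Return (low, high) total amount in mg for a subject of body_mass_kg."""
    if body_mass_kg <= 0:
        raise ValueError("body mass must be positive")
    if low_mg_per_kg > high_mg_per_kg:
        raise ValueError("lower bound exceeds upper bound")
    return (low_mg_per_kg * body_mass_kg, high_mg_per_kg * body_mass_kg)


if __name__ == "__main__":
    mass = 70.0  # hypothetical body mass in kg, chosen only for illustration
    for label, (lo, hi) in (("broad", BROAD_RANGE_MG_PER_KG),
                            ("narrow", NARROW_RANGE_MG_PER_KG)):
        low, high = dose_range_mg(mass, lo, hi)
        print(f"{label}: {low:.2f} mg to {high:.2f} mg")
```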

Delivery Methods 

Once formulated, the polynucleotide compositions of the invention can be 
administered (1) directly to the subject; (2) delivered ex vivo, to cells derived from the 
subject; or (3) in vitro for expression of recombinant proteins. The subjects to be treated can
be mammals or birds. Also, human subjects can be treated.

Direct delivery of the compositions will generally be accomplished by injection, 
either subcutaneously, intraperitoneally, transdermally or transcutaneously, intravenously or 
intramuscularly or delivered to the interstitial space of a tissue. The compositions can also be 
administered into a tumor or lesion. Other modes of administration include oral and
pulmonary administration, suppositories, and transdermal applications, needles, and gene
guns or hyposprays. Dosage treatment may be a single dose schedule or a multiple dose 
schedule. See WO98/20734. 

Methods for the ex vivo delivery and reimplantation of transformed cells into a subject 
are known in the art and described in e.g., WO93/14778. Examples of cells useful in ex vivo
applications include, for example, stem cells, particularly hematopoietic, lymph cells,
macrophages, dendritic cells, or tumor cells. 

Generally, delivery of nucleic acids for both ex vivo and in vitro applications can be
accomplished by the following procedures, for example, dextran-mediated transfection,
calcium phosphate precipitation, polybrene mediated transfection, protoplast fusion,
electroporation, encapsulation of the polynucleotide(s) in liposomes, and direct
microinjection of the DNA into nuclei, all well known in the art.



Polynucleotide and Polypeptide pharmaceutical compositions 

In addition to the pharmaceutically acceptable carriers and salts described above, the 
following additional agents can be used with polynucleotide and/or polypeptide 
compositions.



A. Polypeptides 

One example is polypeptides, which include, without limitation: asialoorosomucoid
(ASOR); transferrin; asialoglycoproteins; antibodies; antibody fragments; ferritin;
interleukins; interferons; granulocyte-macrophage colony stimulating factor (GM-CSF);
granulocyte colony stimulating factor (G-CSF); macrophage colony stimulating factor
(M-CSF); stem cell factor; and erythropoietin. Viral antigens, such as envelope proteins, can
also be used. Also, proteins from other invasive organisms can be used, such as the 17 amino
acid peptide from the circumsporozoite protein of Plasmodium falciparum known as RII.

B. Hormones, Vitamins, Etc. 

Other groups that can be included in a pharmaceutical composition include, for 
example: hormones, steroids, androgens, estrogens, thyroid hormone, or vitamins, folic acid. 



C. Polyalkylenes, Polysaccharides, etc.

Also, polyalkylene glycol can be included in a pharmaceutical composition with the
desired polynucleotides and/or polypeptides. In a preferred embodiment, the polyalkylene
glycol is polyethylene glycol. In addition, mono-, di-, or polysaccharides can be included. In a
preferred embodiment of this aspect, the polysaccharide is dextran or DEAE-dextran. Also,
chitosan and poly(lactide-co-glycolide) may be included in a pharmaceutical composition.

D. Lipids, and Liposomes

The desired polynucleotide or polypeptide can also be encapsulated in lipids or 
packaged in liposomes prior to delivery to the subject or to cells derived therefrom. 

Lipid encapsulation is generally accomplished using liposomes which are able to
stably bind or entrap and retain nucleic acid or polypeptide. The ratio of condensed
polynucleotide to lipid preparation can vary but will generally be around 1:1 (mg
DNA:micromoles lipid), or more of lipid. For a review of the use of liposomes as carriers for
delivery of nucleic acids, see, Hug and Sleight (1991) Biochim. Biophys. Acta. 1097:1-17;
Straubinger (1983) Meth. Enzymol. 101:512-527.
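
The ratio stated above (about 1 mg DNA to 1 micromole lipid, or more lipid) can be turned into a quick calculation. The Python sketch below is illustrative only: the function names are hypothetical, and the lipid molar mass must be supplied by the user because the text does not give one.

```python
def lipid_micromoles(dna_mg, umol_lipid_per_mg_dna=1.0):
    """Micromoles of lipid for dna_mg of DNA at the given ratio.

    The default of 1.0 reflects the approximately 1:1 (mg DNA : micromoles
    lipid) ratio in the text; larger values correspond to "more of lipid".
    """
    if dna_mg < 0 or umol_lipid_per_mg_dna <= 0:
        raise ValueError("amounts must be non-negative and ratio positive")
    return dna_mg * umol_lipid_per_mg_dna


def lipid_mass_mg(lipid_umol, molar_mass_g_per_mol):
    """Convert micromoles of lipid to milligrams.

    1 micromole x (g/mol) = micrograms, so divide by 1000 to obtain mg.
    ASSUMPTION: molar_mass_g_per_mol is supplied by the caller; no value
    is given in the text.
    """
    if molar_mass_g_per_mol <= 0:
        raise ValueError("molar mass must be positive")
    return lipid_umol * molar_mass_g_per_mol / 1000.0


if __name__ == "__main__":
    umol = lipid_micromoles(2.0)         # 2 mg DNA at the 1:1 ratio
    print(umol, "micromoles lipid")
    print(lipid_mass_mg(umol, 700.0), "mg lipid (hypothetical 700 g/mol)")
```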

Liposomal preparations for use in the present invention include cationic (positively
charged), anionic (negatively charged) and neutral preparations. Cationic liposomes have
been shown to mediate intracellular delivery of plasmid DNA (Felgner (1987) Proc. Natl.
Acad. Sci. USA 84:7413-7416); mRNA (Malone (1989) Proc. Natl. Acad. Sci. USA
86:6077-6081); and purified transcription factors (Debs (1990) J. Biol. Chem.
265:10189-10192), in functional form.

Cationic liposomes are readily available. For example,
N-[1-(2,3-dioleyloxy)propyl]-N,N,N-triethylammonium (DOTMA) liposomes are available
under the trademark Lipofectin, from GIBCO BRL, Grand Island, NY. (See, also, Felgner
supra). Other commercially available liposomes include transfectace (DDAB/DOPE) and
DOTAP/DOPE (Boehringer). Other cationic liposomes can be prepared from readily
available materials using techniques well known in the art. See, e.g., Szoka (1978) Proc.
Natl. Acad. Sci. USA 75:4194-4198; WO90/11092 for a description of the synthesis of
DOTAP (1,2-bis(oleoyloxy)-3-(trimethylammonio)propane) liposomes.

Similarly, anionic and neutral liposomes are readily available, such as from Avanti
Polar Lipids (Birmingham, AL), or can be easily prepared using readily available materials.
Such materials include phosphatidyl choline, cholesterol, phosphatidyl ethanolamine,
dioleoylphosphatidyl choline (DOPC), dioleoylphosphatidyl glycerol (DOPG),
dioleoylphosphatidyl ethanolamine (DOPE), among others. These materials can also be mixed
with the DOTMA and DOTAP starting materials in appropriate ratios. Methods for making
liposomes using these materials are well known in the art.

The liposomes can comprise multilamellar vesicles (MLVs), small unilamellar
vesicles (SUVs), or large unilamellar vesicles (LUVs). The various liposome-nucleic acid
complexes are prepared using methods known in the art. See e.g., Straubinger (1983) Meth.
Enzymol. 101:512-527; Szoka (1978) Proc. Natl. Acad. Sci. USA 75:4194-4198;
Papahadjopoulos (1975) Biochim. Biophys. Acta 394:483; Wilson (1979) Cell 17:17;
Deamer & Bangham (1976) Biochim. Biophys. Acta 443:629; Ostro (1977) Biochem.
Biophys. Res. Commun. 76:836; Fraley (1979) Proc. Natl. Acad. Sci. USA 76:3348; Enoch &
Strittmatter (1979) Proc. Natl. Acad. Sci. USA 76:145; Fraley (1980) J. Biol. Chem.
255:10431; Szoka & Papahadjopoulos (1978) Proc. Natl. Acad. Sci. USA 75:145; and
Schaefer-Ridder (1982) Science 215:166.

E. Lipoproteins 

In addition, lipoproteins can be included with the polynucleotide or polypeptide to be 
delivered. Examples of lipoproteins to be utilized include: chylomicrons, HDL, IDL, LDL,
and VLDL. Mutants, fragments, or fusions of these proteins can also be used. Also,
modifications of naturally occurring lipoproteins can be used, such as acetylated LDL. These
lipoproteins can target the delivery of polynucleotides to cells expressing lipoprotein
receptors. Preferably, if lipoproteins are included with the polynucleotide to be delivered, no
other targeting ligand is included in the composition.

Naturally occurring lipoproteins comprise a lipid and a protein portion. The protein
portions are known as apoproteins. At the present, apoproteins A, B, C, D, and E have been
isolated and identified. At least two of these contain several proteins, designated by Roman
numerals, AI, AII, AIV; CI, CII, CIII.

A lipoprotein can comprise more than one apoprotein. For example, naturally
occurring chylomicrons comprise apoproteins A, B, C, and E; over time these lipoproteins
lose A and acquire C and E apoproteins. VLDL comprises A, B, C, and E apoproteins; LDL
comprises apoprotein B; and HDL comprises apoproteins A, C, and E.

The amino acid sequences of these apoproteins are known and are described in, for
example, Breslow (1985) Annu Rev. Biochem 54:699; Law (1986) Adv. Exp Med. Biol.
151:162; Chen (1986) J Biol Chem 261:12918; Kane (1980) Proc Natl Acad Sci USA
77:2465; and Utermann (1984) Hum Genet 65:232.

Lipoproteins contain a variety of lipids including triglycerides, cholesterol (free and
esters), and phospholipids. The composition of the lipids varies in naturally occurring
lipoproteins. For example, chylomicrons comprise mainly triglycerides. A more detailed
description of the lipid content of naturally occurring lipoproteins can be found, for example,
in Meth. Enzymol. 128 (1986). The composition of the lipids is chosen to aid in
conformation of the apoprotein for receptor binding activity. The composition of lipids can
also be chosen to facilitate hydrophobic interaction and association with the polynucleotide
binding molecule.

Naturally occurring lipoproteins can be isolated from serum by ultracentrifugation, for
instance. Such methods are described in Meth. Enzymol. (supra); Pitas (1980) J. Biochem.
255:5454-5460 and Mahley (1979) J Clin. Invest 64:743-750.

Lipoproteins can also be produced by in vitro or recombinant methods by expression
of the apoprotein genes in a desired host cell. See, for example, Atkinson (1986) Annu Rev
Biophys Chem 15:403 and Radding (1958) Biochim Biophys Acta 30:443.

Lipoproteins can also be purchased from commercial suppliers, such as Biomedical
Technologies, Inc., Stoughton, Massachusetts, USA.

Further description of lipoproteins can be found in Zuckermann et al., PCT Appln.
No. US97/14465.

F. Polycationic Agents 

Polycationic agents can be included, with or without lipoprotein, in a composition 
with the desired polynucleotide and/or polypeptide to be delivered. 

Polycationic agents typically exhibit a net positive charge at physiologically relevant 
pH and are capable of neutralizing the electrical charge of nucleic acids to facilitate delivery 
to a desired location. These agents have in vitro, ex vivo, and in vivo applications. 
Polycationic agents can be used to deliver nucleic acids to a living subject 
intramuscularly, subcutaneously, etc. 

The following are examples of useful polypeptides as polycationic agents: polylysine, 
polyarginine, polyornithine, and protamine. Other examples of useful polypeptides include 
histones, protamines, human serum albumin, DNA binding proteins, non-histone 
chromosomal proteins, and coat proteins from DNA viruses, such as ΦX174. Transcriptional 
factors also contain domains that bind DNA and therefore may be useful as nucleic acid 
condensing agents. Briefly, transcriptional factors such as C/CEBP, c-jun, c-fos, AP-1, AP-2, 
AP-3, CPF, Prot-1, Sp-1, Oct-1, Oct-2, CREP, and TFIID contain basic domains that bind 
DNA sequences. 

Organic polycationic agents include: spermine, spermidine, and putrescine. 

The dimensions and the physical properties of a polycationic agent can be 
extrapolated from the list above, to construct other polypeptide polycationic agents or to 
produce synthetic polycationic agents. 



G. Synthetic Polycationic Agents 

Synthetic polycationic agents which are useful in pharmaceutical compositions 
include, for example, DEAE-dextran and polybrene. Lipofectin™ and lipofectAMINE™ are 
monomers that form polycationic complexes when combined with polynucleotides or 
polypeptides. 

Immunodiagnostic Assays 

Neisseria MenB antigens, or antigenic fragments thereof, of the invention can be used 
in immunoassays to detect antibody levels (or, conversely, anti-Neisseria MenB antibodies 
can be used to detect antigen levels). Immunoassays based on well defined, recombinant 
antigens can be developed to replace invasive diagnostic methods. Antibodies to Neisseria 
MenB proteins or fragments thereof within biological samples, including for example, blood 
or serum samples, can be detected. Design of the immunoassays is subject to a great deal of 
variation, and a variety of these are known in the art. Protocols for the immunoassay may be 
based, for example, upon competition, or direct reaction, or sandwich type assays. Protocols 
may also, for example, use solid supports, or may be by immunoprecipitation. Most assays 
involve the use of labeled antibody or polypeptide; the labels may be, for example, 
fluorescent, chemiluminescent, radioactive, or dye molecules. Assays which amplify the 
signals from the probe are also known; examples of which are assays which utilize biotin and 
avidin, and enzyme-labeled and mediated immunoassays, such as ELISA assays. 

Kits suitable for immunodiagnosis and containing the appropriate labeled reagents are 
constructed by packaging the appropriate materials, including the compositions of the 
invention, in suitable containers, along with the remaining reagents and materials (for 
example, suitable buffers, salt solutions, etc.) required for the conduct of the assay, as well as 
a suitable set of assay instructions. 

Nucleic Acid Hybridization 

"Hybridization" refers to the association of two nucleic acid sequences to one another 
by hydrogen bonding. Typically, one sequence will be fixed to a solid support and the other 
will be free in solution. Then, the two sequences will be placed in contact with one another 
under conditions that favor hydrogen bonding. Factors that affect this bonding include: the 
type and volume of solvent; reaction temperature; time of hybridization; agitation; agents to 
block the non-specific attachment of the liquid phase sequence to the solid support 
(Denhardt's reagent or BLOTTO); concentration of the sequences; use of compounds to 
increase the rate of association of sequences (dextran sulfate or polyethylene glycol); and the 
stringency of the washing conditions following hybridization. See Sambrook et al. (supra) 
Volume 2, chapter 9, pages 9.47 to 9.57. 

"Stringency" refers to conditions in a hybridization reaction that favor association of 
very similar sequences over sequences that differ. For example, the combination of 
temperature and salt concentration should be chosen that is approximately 12 to 20°C 
below the calculated Tm of the hybrid under study. The temperature and salt conditions can 
often be determined empirically in preliminary experiments in which samples of genomic 
DNA immobilized on filters are hybridized to the sequence of interest and then washed under 
conditions of different stringencies. See Sambrook et al. at page 9.50. 

Variables to consider when performing, for example, a Southern blot are (1) the 
complexity of the DNA being blotted and (2) the homology between the probe and the 
sequences being detected. The total amount of the fragment(s) to be studied can vary by a 
magnitude of 10, from 0.1 to 1 µg for a plasmid or phage digest to 10⁻⁹ to 10⁻⁸ g for a single 
copy gene in a highly complex eukaryotic genome. For lower complexity polynucleotides, 
substantially shorter blotting, hybridization, and exposure times, a smaller amount of starting 
polynucleotides, and lower specific activity of probes can be used. For example, a 
single-copy yeast gene can be detected with an exposure time of only 1 hour starting with 1 
µg of yeast DNA, blotting for two hours, and hybridizing for 4-8 hours with a probe of 10⁸ 
cpm/µg. For a single-copy mammalian gene a conservative approach would start with 10 µg 
of DNA, blot overnight, and hybridize overnight in the presence of 10% dextran sulfate using 
a probe of greater than 10⁸ cpm/µg, resulting in an exposure time of ~24 hours. 

Several factors can affect the melting temperature (Tm) of a DNA-DNA hybrid 
between the probe and the fragment of interest, and consequently, the appropriate conditions 
for hybridization and washing. In many cases the probe is not 100% homologous to the 
fragment. Other commonly encountered variables include the length and total G+C content of 
the hybridizing sequences and the ionic strength and formamide content of the hybridization 
buffer. The effects of all of these factors can be approximated by a single equation: 

Tm = 81 + 16.6(log10 Ci) + 0.4[%(G + C)] - 0.6(%formamide) - 600/n - 1.5(%mismatch) 

where Ci is the salt concentration (monovalent ions) and n is the length of the hybrid in base 
pairs (slightly modified from Meinkoth & Wahl (1984) Anal. Biochem. 138:267-284). 
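
The equation above can be evaluated directly. The following is a minimal sketch in Python, 
not part of the original disclosure; the function and parameter names are illustrative 
assumptions, and the coefficients are copied from the equation as printed in the text. 

```python
import math


def hybrid_tm(salt_molar, gc_percent, formamide_percent, length_bp,
              mismatch_percent=0.0):
    """Approximate Tm (deg C) of a DNA-DNA hybrid using the equation in the text.

    salt_molar        -- Ci, monovalent ion concentration in mol/l
    gc_percent        -- %(G + C) of the hybridizing sequence
    formamide_percent -- % formamide in the hybridization buffer
    length_bp         -- n, length of the hybrid in base pairs
    mismatch_percent  -- % mismatch between probe and target
    """
    if salt_molar <= 0 or length_bp <= 0:
        raise ValueError("salt concentration and hybrid length must be positive")
    return (81.0
            + 16.6 * math.log10(salt_molar)
            + 0.4 * gc_percent
            - 0.6 * formamide_percent
            - 600.0 / length_bp
            - 1.5 * mismatch_percent)


# Hypothetical inputs: 1 M salt, 50% G+C, 50% formamide, 600 bp hybrid, 5% mismatch.
tm = hybrid_tm(1.0, 50.0, 50.0, 600, mismatch_percent=5.0)
```

Under the stringency guidance given earlier, a working temperature would then be chosen 
some 12 to 20°C below the value returned. 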

In designing a hybridization experiment, some factors affecting nucleic acid 
hybridization can be conveniently altered. The temperature of the hybridization and washes 
and the salt concentration during the washes are the simplest to adjust. As the temperature of 
the hybridization increases (i.e., stringency), it becomes less likely for hybridization to occur 
between strands that are nonhomologous, and as a result, background decreases. If the 
radiolabeled probe is not completely homologous with the immobilized fragment (as is 
frequently the case in gene family and interspecies hybridization experiments), the 
hybridization temperature must be reduced, and background will increase. The temperature of 
the washes affects the intensity of the hybridizing band and the degree of background in a 
similar manner. The stringency of the washes is also increased with decreasing salt 
concentrations. 

In general, convenient hybridization temperatures in the presence of 50% formamide 
are 42°C for a probe which is 95% to 100% homologous to the target fragment, 37°C for 90% 
to 95% homology, and 32°C for 85% to 90% homology. For lower homologies, formamide 
content should be lowered and temperature adjusted accordingly, using the equation above. If 
the homology between the probe and the target fragment is not known, the simplest 
approach is to start with both hybridization and wash conditions which are nonstringent. If 
non-specific bands or high background are observed after autoradiography, the filter can be 
washed at high stringency and reexposed. If the time required for exposure makes this 
approach impractical, several hybridization and/or washing stringencies should be tested in 
parallel. 
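
The temperature guidance above amounts to a simple lookup. The sketch below is 
illustrative only; the function name is hypothetical, and because the ranges in the text 
share their end points (90% and 95%), assigning a boundary value to the higher 
temperature is an assumption made here. 

```python
def hybridization_temp_50pct_formamide(homology_percent):
    """Return the convenient hybridization temperature (deg C) in 50% formamide.

    Returns None below 85% homology, where the text advises lowering the
    formamide content and recalculating instead of using a fixed temperature.
    """
    if not 0 <= homology_percent <= 100:
        raise ValueError("homology must be between 0 and 100 percent")
    if homology_percent >= 95:
        return 42
    if homology_percent >= 90:  # boundary assigned upward: an assumption
        return 37
    if homology_percent >= 85:
        return 32
    return None
```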

Nucleic Acid Probe Assays 

Methods such as PCR, branched DNA probe assays, or blotting techniques utilizing 
nucleic acid probes according to the invention can determine the presence of cDNA or 
mRNA. A probe is said to "hybridize" with a sequence of the invention if it can form a 
duplex or double stranded complex, which is stable enough to be detected. 

The nucleic acid probes will hybridize to the Neisserial nucleotide sequences of the 
invention (including both sense and antisense strands). Though many different nucleotide 
sequences will encode the amino acid sequence, the native Neisserial sequence is preferred 
because it is the actual sequence present in cells. mRNA represents a coding sequence and so 
a probe should be complementary to the coding sequence; single-stranded cDNA is 
complementary to mRNA, and so a cDNA probe should be complementary to the non-coding 
sequence. 

The probe sequence need not be identical to the Neisserial sequence (or its 
complement); some variation in the sequence and length can lead to increased assay 
sensitivity if the nucleic acid probe can form a duplex with target nucleotides, which can be 
detected. Also, the nucleic acid probe can include additional nucleotides to stabilize the 
formed duplex. Additional Neisserial sequence may also be helpful as a label to detect the 
formed duplex. For example, a non-complementary nucleotide sequence may be attached to 
the 5' end of the probe, with the remainder of the probe sequence being complementary to a 
Neisserial sequence. Alternatively, non-complementary bases or longer sequences can be 
interspersed into the probe, provided that the probe sequence has sufficient complementarity 
with a Neisserial sequence in order to hybridize therewith and thereby form a duplex 
which can be detected. 

The exact length and sequence of the probe will depend on the hybridization 
conditions, such as temperature, salt condition and the like. For example, for diagnostic 
applications, depending on the complexity of the analyte sequence, the nucleic acid probe 
typically contains at least 10-20 nucleotides, preferably 15-25, and more preferably at least 
30 nucleotides, although it may be shorter than this. Short primers generally require cooler 
temperatures to form sufficiently stable hybrid complexes with the template. 

Probes may be produced by synthetic procedures, such as the triester method of 
Matteucci et al. (J. Am. Chem. Soc. (1981) 103:3185), or according to Urdea et al. (Proc. 
Natl. Acad. Sci. USA (1983) 80: 7461), or using commercially available automated 
oligonucleotide synthesizers. 

The chemical nature of the probe can be selected according to preference. For certain 
applications, DNA or RNA are appropriate. For other applications, modifications may be 
incorporated, e.g., backbone modifications, such as phosphorothioates or 
methylphosphonates, can be used to increase in vivo half-life, alter RNA affinity, increase 
nuclease resistance etc. (e.g., see Agrawal & Iyer (1995) Curr Opin Biotechnol 6:12-19; 
Agrawal (1996) TIBTECH 14:376-387); analogues such as peptide nucleic acids may also be 
used (e.g., see Corey (1997) TIBTECH 15:224-229; Buchardt et al. (1993) TIBTECH 
11:384-386). 

One example of a nucleotide hybridization assay is described by Urdea et al. in 
international patent application WO92/02526 (see also U.S. Patent 5,124,246). 

Alternatively, the polymerase chain reaction (PCR) is another well-known means for 
detecting small amounts of target nucleic acids. The assay is described in: Mullis et al. (Meth. 
Enzymol. (1987) 155: 335-350); US patent 4,683,195; and US patent 4,683,202. Two 
"primer" nucleotides hybridize with the target nucleic acids and are used to prime the 
reaction. The primers can comprise sequence that does not hybridize to the sequence of the 
amplification target (or its complement) to aid with duplex stability or, for example, to 
incorporate a convenient restriction site. Typically, such sequence will flank the desired 
Neisserial sequence. 

A thermostable polymerase creates copies of target nucleic acids from the primers 
using the original target nucleic acids as a template. After a threshold amount of target 
nucleic acids are generated by the polymerase, they can be detected by more traditional 
methods, such as Southern blots. When using the Southern blot method, the labeled probe 
will hybridize to the Neisserial sequence (or its complement). 

Also, mRNA or cDNA can be detected by traditional blotting techniques described in 
Sambrook et al (supra). mRNA, or cDNA generated from mRNA using a polymerase 
enzyme, can be purified and separated using gel electrophoresis. The nucleic acids on the gel 
are then blotted onto a solid support, such as nitrocellulose. The solid support is exposed to a 
labeled probe and then washed to remove any unhybridized probe. Next, the duplexes 
containing the labeled probe are detected. Typically, the probe is labeled with a radioactive 
moiety. 

EXAMPLES 

The invention is based on the 961 nucleotide sequences from the genome of 
N. meningitidis set out in Appendix C, SEQ ID NOs: 1-961, which together represent 
substantially the complete genome of serotype B of N. meningitidis, as well as the full length 
genome sequence shown in Appendix D, SEQ ID NO 1068. 

It will be self-evident to the skilled person how this sequence information can be 
utilized according to the invention, as above described. 

The standard techniques and procedures which may be employed in order to perform 
the invention (e.g. to utilize the disclosed sequences to predict polypeptides useful for 
vaccination or diagnostic purposes) were summarized above. This summary is not a 
limitation on the invention but, rather, gives examples that may be used, but are not required. 

These sequences are derived from contigs shown in Appendix C (SEQ ID NOs 1-961) 
and from the full length genome sequence shown in Appendix D (SEQ ID NO 1068), which 
were prepared during the sequencing of the genome of N. meningitidis (strain B). The full 
length sequence was assembled using the TIGR Assembler as described by G.S. Sutton et al., 
TIGR Assembler: A New Tool for Assembling Large Shotgun Sequencing Projects, Genome 
Science and Technology, 1:9-19 (1995) [see also R. D. Fleischmann, et al., Science 269, 496- 
512 (1995); C. M. Fraser, et al., Science 270, 397-403 (1995); C. J. Bult, et al., Science 273, 
1058-73 (1996); C. M. Fraser, et al., Nature 390, 580-586 (1997); J.-F. Tomb, et al., Nature 
388, 539-547 (1997); H. P. Klenk, et al., Nature 390, 364-70 (1997); C. M. Fraser, et al., 
Science 281, 375-88 (1998); M. J. Gardner, et al., Science 282, 1126-1132 (1998); K. E. 
Nelson, et al., Nature 399, 323-9 (1999)]. Then, using the above-described methods, putative 
translation products of the sequences were determined. Computer analysis of the translation 
products was performed based on database comparisons. Corresponding gene and protein 
sequences, if any, were identified in Neisseria meningitidis (Strain A) and Neisseria 
gonorrhoeae. Then the proteins were expressed, purified, and characterized to assess their 
antigenicity and immunogenicity. 

In particular, the following methods were used to express, purify, and biochemically 
characterize the proteins of the invention. 


Chromosomal DNA Preparation 

N. meningitidis strain 2996 was grown to exponential phase in 100 ml of GC medium, 
harvested by centrifugation, and resuspended in 5 ml buffer (20% Sucrose, 50 mM Tris-HCl, 
50 mM EDTA, adjusted to pH 8.0). After 10 minutes incubation on ice, the bacteria were 
lysed by adding 10 ml lysis solution (50 mM NaCl, 1% Na-Sarkosyl, 50 µg/ml Proteinase K), 
and the suspension was incubated at 37°C for 2 hours. Two phenol extractions (equilibrated 
to pH 8) and one CHCl3/isoamyl alcohol (24:1) extraction were performed. DNA was 
precipitated by addition of 0.3M sodium acetate and 2 volumes ethanol, and was collected by 
centrifugation. The pellet was washed once with 70% ethanol and redissolved in 4 ml buffer 
(10 mM Tris-HCl, 1 mM EDTA, pH 8). The DNA concentration was measured by reading 
the OD at 260 nm. 
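
The text does not state how the OD260 reading was converted to a concentration. A 
minimal sketch follows, assuming the commonly used conversion of about 50 µg/ml of 
double-stranded DNA per absorbance unit at a 1 cm path length; that factor, the 
function names, and the example reading are assumptions, not values from this disclosure. 

```python
# Assumption: 1.0 A260 unit ~ 50 ug/ml double-stranded DNA at 1 cm path length.
DSDNA_UG_PER_ML_PER_A260 = 50.0


def dsdna_concentration_ug_per_ml(a260, dilution_factor=1.0, path_cm=1.0):
    """Estimate dsDNA concentration (ug/ml) of the undiluted sample."""
    if a260 < 0 or dilution_factor <= 0 or path_cm <= 0:
        raise ValueError("inputs must be positive (a260 may be zero)")
    return a260 / path_cm * DSDNA_UG_PER_ML_PER_A260 * dilution_factor


def total_yield_ug(concentration_ug_per_ml, volume_ml):
    """Total DNA mass (ug) in the given volume."""
    return concentration_ug_per_ml * volume_ml


# Hypothetical reading: A260 = 0.2 on a 1:50 dilution of the 4 ml preparation.
conc = dsdna_concentration_ug_per_ml(0.2, dilution_factor=50)
yield_ug = total_yield_ug(conc, 4.0)
```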

Oligonucleotide design 

Synthetic oligonucleotide primers were designed on the basis of the coding sequence 
of each ORF, using (a) the meningococcus B sequence when available, or (b) the 
gonococcus/meningococcus A sequence, adapted to the codon preference usage of 
meningococcus. Any predicted signal peptides were omitted, by deducing the 5'-end 
amplification primer sequence immediately downstream from the predicted leader sequence. 

For most ORFs, the 5' primers included two restriction enzyme recognition sites 
(BamHI-NdeI, BamHI-NheI, or EcoRI-NheI, depending on the gene's restriction pattern); the 
3' primers included a XhoI restriction site. This procedure was established in order to direct 
the cloning of each amplification product (corresponding to each ORF) into two different 
expression systems: pGEX-KG (using either BamHI-XhoI or EcoRI-XhoI), and pET21b+ 
(using either NdeI-XhoI or NheI-XhoI). 

5'-end primer tail: CGCGGATCCCATATG (BamHI-NdeI) 
                    CGCGGATCCGCTAGC (BamHI-NheI) 
                    CCGGAATTCTAGCTAGC (EcoRI-NheI) 
3'-end primer tail: CCCGCTCGAG (XhoI) 
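
As a cross-check on the primer tails listed above, the short sketch below confirms that 
each tail carries the recognition sequences it is named for. It is illustrative only: the 
recognition sequences are the standard ones for these enzymes, the tail strings are the 
sequences above with spacing removed, and the helper names are hypothetical. 

```python
# Standard recognition sequences (5'->3') for the enzymes named in the text.
SITES = {
    "BamHI": "GGATCC",
    "NdeI": "CATATG",
    "NheI": "GCTAGC",
    "EcoRI": "GAATTC",
    "XhoI": "CTCGAG",
}

# Primer tails as listed in the text, keyed by the enzymes they are named for.
TAILS = {
    "BamHI-NdeI": "CGCGGATCCCATATG",
    "BamHI-NheI": "CGCGGATCCGCTAGC",
    "EcoRI-NheI": "CCGGAATTCTAGCTAGC",
    "XhoI": "CCCGCTCGAG",
}


def sites_in(seq):
    """Return the set of enzyme names whose recognition site occurs in seq."""
    seq = seq.upper()
    return {name for name, site in SITES.items() if site in seq}


def tail_matches_label(label, tail):
    """True if the tail contains exactly the sites named in its label."""
    return sites_in(tail) == set(label.split("-"))


def build_primer(tail, hybridizing_region):
    """Join a tail to the ORF-specific hybridizing region (5'->3')."""
    return tail.upper() + hybridizing_region.upper()


all_ok = all(tail_matches_label(label, tail) for label, tail in TAILS.items())
```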

For some ORFs, two different amplifications were performed to clone each ORF in 
the two expression systems. Two different 5' primers were used for each ORF; the same 3' 
XhoI primer was used as before: 

5'-end primer tail: GGAATTCCATATGGCCATGG (NdeI) 
5'-end primer tail: CGGGATCC (BamHI) 

Other ORFs were cloned in the pTRC expression vector and expressed as an 
amino-terminus His-tag fusion. The predicted signal peptide may be included in the final 
product. NheI-BamHI restriction sites were incorporated using primers: 

5'-end primer tail: GATCAGCTAGCCATATG (NheI) 
3'-end primer tail: CGGGATCC (BamHI) 

As well as containing the restriction enzyme recognition sequences, the primers 
included nucleotides which hybridized to the sequence to be amplified. The number of 
hybridizing nucleotides depended on the melting temperature of the whole primer, and was 
determined for each primer using the formulae: 

Tm = 4 (G+C) + 2 (A+T)               (tail excluded) 

Tm = 64.9 + 0.41 (% GC) - 600/N      (whole primer) 

The average melting temperature of the selected oligos was 65-70°C for the whole 
oligo and 50-55°C for the hybridising region alone. 
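
The two formulae above can be applied to a candidate primer as follows. This is a sketch 
only; the function names and the example sequences are hypothetical and are not primers 
from Table 1. 

```python
def _count(seq, bases):
    """Count occurrences of any of the given bases in seq (case-insensitive)."""
    return sum(seq.upper().count(b) for b in bases)


def tm_hybridizing_region(seq):
    """Tm = 4(G+C) + 2(A+T), applied to the hybridizing region (tail excluded)."""
    return 4 * _count(seq, "GC") + 2 * _count(seq, "AT")


def tm_whole_primer(seq):
    """Tm = 64.9 + 0.41(%GC) - 600/N, applied to the whole primer of length N."""
    n = len(seq)
    if n == 0:
        raise ValueError("primer sequence is empty")
    gc_percent = 100.0 * _count(seq, "GC") / n
    return 64.9 + 0.41 * gc_percent - 600.0 / n


# Hypothetical 18-nt hybridizing region joined to the BamHI-NdeI tail from the text.
region = "ATGCATGCATGCATGCAT"
primer = "CGCGGATCCCATATG" + region
region_tm = tm_hybridizing_region(region)
primer_tm = tm_whole_primer(primer)
```

In practice the hybridizing region would be lengthened or shortened until both values 
fall in the ranges stated above. 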

Oligos were synthesized by a Perkin Elmer 394 DNA/RNA Synthesizer, eluted from 
the columns in 2 ml NH4-OH, and deprotected by 5 hours incubation at 56°C. The oligos 
were precipitated by addition of 0.3M Na-Acetate and 2 volumes ethanol. The samples were 
then centrifuged and the pellets resuspended in either 100 µl or 1 ml of water. OD260 was 
determined using a Perkin Elmer Lambda Bio spectrophotometer and the concentration was 
determined and adjusted to 2-10 pmol/µl. 

Table 1 shows the forward and reverse primers used for each amplification. In certain 
cases, it might be noted that the sequence of the primer does not exactly match the sequence 
in the ORF. When initial amplifications are performed, the complete 5' and/or 3' sequence 
may not be known for some meningococcal ORFs, although the corresponding sequences 
may have been identified in gonococcus. For amplification, the gonococcal sequences could 
thus be used as the basis for primer design, altered to take account of codon preference. In 
particular, the following codons may be changed: ATA->ATT; TCG->TCT; CAG->CAA; 
AAG->AAA; GAG->GAA; CGA and CGG->CGC; GGG->GGC. 
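
The codon changes listed above can be applied mechanically to an in-frame sequence. The 
sketch below is illustrative; the function name and the test sequence are hypothetical, 
and the mapping contains only the substitutions named in the text. 

```python
# Codon substitutions listed in the text (each pair is synonymous).
CODON_CHANGES = {
    "ATA": "ATT",
    "TCG": "TCT",
    "CAG": "CAA",
    "AAG": "AAA",
    "GAG": "GAA",
    "CGA": "CGC",
    "CGG": "CGC",
    "GGG": "GGC",
}


def adapt_codons(coding_sequence):
    """Apply the listed codon changes to an in-frame coding sequence."""
    seq = coding_sequence.upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3 (in frame)")
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    return "".join(CODON_CHANGES.get(codon, codon) for codon in codons)


# Hypothetical in-frame fragment containing every codon from the list.
fragment = "".join(["ATG", "ATA", "TCG", "CAG", "AAG", "GAG", "CGA", "CGG", "GGG", "TAA"])
adapted = adapt_codons(fragment)
```

Working codon by codon, not by plain text replacement, avoids altering triplets that 
merely span two adjacent codons. 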

Amplification 

The standard PCR protocol was as follows: 50-200 ng of genomic DNA were used as 
a template in the presence of 20-40 µM of each oligo, 400-800 µM dNTPs solution, 1x PCR 
buffer (including 1.5 mM MgCl2), 2.5 units TaqI DNA polymerase (using Perkin-Elmer 
AmpliTaQ, GIBCO Platinum, Pwo DNA polymerase, or Tahara Shuzo Taq polymerase). 

In some cases, PCR was optimised by the addition of 10 µl DMSO or 50 µl 2M 
betaine. 

After a hot start (adding the polymerase during a preliminary 3 minute incubation of 
the whole mix at 95°C), each sample underwent a double-step amplification: the first 5 cycles 
were performed using the hybridization temperature of the oligos excluding the 
restriction enzyme tail, followed by 30 cycles performed according to the hybridization 
temperature of the whole-length oligos. The cycles were followed by a final 10 minute 
extension step at 72°C. 

The standard cycles were as follows: 

                  Denaturation        Hybridisation          Elongation 
First 5 cycles    30 seconds, 95°C    30 seconds, 50-55°C    30-60 seconds, 72°C 
Last 30 cycles    30 seconds, 95°C    30 seconds, 55-70°C    30-60 seconds, 72°C 

The elongation time varied according to the length of the ORF to be amplified. 
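
The two-stage cycling scheme described above can be written out as a simple program 
description. This is a sketch under stated assumptions: the class and function names are 
hypothetical, the hybridisation temperatures passed in are example values within the 
ranges given in the table, and ramp times between steps are ignored. 

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: float
    seconds: int


# (label, number of cycles, steps performed in each cycle)
Stage = Tuple[str, int, List[Step]]


def two_stage_program(anneal_tailless_c: float, anneal_whole_c: float,
                      elongation_s: int = 60) -> List[Stage]:
    """Build the hot start, 5 + 30 cycles, and final extension from the text."""
    if not 30 <= elongation_s <= 60:
        raise ValueError("the table gives 30-60 seconds for elongation")

    def cycle(anneal_c: float) -> List[Step]:
        return [
            Step("denaturation", 95.0, 30),
            Step("hybridisation", anneal_c, 30),
            Step("elongation", 72.0, elongation_s),
        ]

    return [
        ("hot start", 1, [Step("denaturation", 95.0, 180)]),
        ("first 5 cycles", 5, cycle(anneal_tailless_c)),
        ("last 30 cycles", 30, cycle(anneal_whole_c)),
        ("final extension", 1, [Step("elongation", 72.0, 600)]),
    ]


def hold_time_seconds(program: List[Stage]) -> int:
    """Total programmed hold time, ignoring ramping between temperatures."""
    return sum(cycles * sum(step.seconds for step in steps)
               for _, cycles, steps in program)


# Example values only: 52 deg C without the tail, 65 deg C for the whole oligo.
program = two_stage_program(52.0, 65.0, elongation_s=60)
total_s = hold_time_seconds(program)
```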

The amplifications were performed using either a 9600 or a 2400 Perkin Elmer 
GeneAmp PCR System. To check the results, 1/10 of the amplification volume was loaded 
onto a 1-1.5% agarose gel and the size of each amplified fragment compared with a DNA 
molecular weight marker. 

The amplified DNA was either loaded directly on a 1% agarose gel or first 
precipitated with ethanol and resuspended in a suitable volume to be loaded on a 1% agarose 
gel. The DNA fragment corresponding to the right size band was then eluted and purified 
from gel, using the Qiagen Gel Extraction Kit, following the instructions of the manufacturer. 
The final volume of the DNA fragment was 30 µl or 50 µl of either water or 10 mM Tris, pH 
8.5. 

Digestion of PCR fragments 

The purified DNA corresponding to the amplified fragment was split into 2 aliquots 
and double-digested with: 

NdeI/XhoI or NheI/XhoI for cloning into pET-21b+ and further expression of the 
protein as a C-terminus His-tag fusion 

BamHI/XhoI or EcoRI/XhoI for cloning into pGEX-KG and further expression of the 
protein as a GST N-terminus fusion. 

For ORF 76, NheI/BamHI for cloning into pTRC-HisA vector and further expression 
of the protein as N-terminus His-tag fusion. 

Each purified DNA fragment was incubated (37°C for 3 hours to overnight) with 20 
units of each restriction enzyme (New England Biolabs) in either a 30 or 40 µl final volume 
in the presence of the appropriate buffer. The digestion product was then purified using the 
QIAquick PCR purification kit, following the manufacturer's instructions, and eluted in a 
final volume of 30 (or 50) µl of either water or 10 mM Tris-HCl, pH 8.5. The final DNA 
concentration was determined by 1% agarose gel electrophoresis in the presence of titrated 
molecular weight marker. 

Digestion of the cloning vectors (pET22B, pGEX-KG and pTRC-HisA) 

10 µg plasmid was double-digested with 50 units of each restriction enzyme in 200 µl 
reaction volume in the presence of appropriate buffer by overnight incubation at 37°C. After 
loading the whole digestion on a 1% agarose gel, the band corresponding to the digested 
vector was purified from the gel using the Qiagen QIAquick Gel Extraction Kit and the DNA 
was eluted in 50 µl of 10 mM Tris-HCl, pH 8.5. The DNA concentration was evaluated by 
measuring OD260 of the sample, and adjusted to 50 ng/µl. 1 µl of plasmid was used for each 
cloning procedure. 

Cloning 

The fragments corresponding to each ORF, previously digested and purified, were 
ligated in both pET22b and pGEX-KG. In a final volume of 20 µl, a molar ratio of 3:1 
fragment/vector was ligated using 0.5 µl of NEB T4 DNA ligase (400 units/µl), in the 
presence of the buffer supplied by the manufacturer. The reaction was incubated at room 
temperature for 3 hours. In some experiments, ligation was performed using the Boehringer 
"Rapid Ligation Kit", following the manufacturer's instructions. 
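
The amount of insert needed for a 3:1 fragment/vector molar ratio follows from the 
relative lengths of the two molecules. The sketch below is illustrative: the function 
name and the example sizes (a 5000 bp vector and a 1000 bp insert) are hypothetical, 
while the 50 ng of vector corresponds to the 1 µl of 50 ng/µl plasmid stated in the text. 

```python
def insert_mass_ng(vector_ng, vector_bp, insert_bp, molar_ratio=3.0):
    """Mass of insert (ng) giving the requested insert:vector molar ratio.

    For double-stranded DNA, moles are proportional to mass / length, so
    insert_ng = vector_ng * (insert_bp / vector_bp) * molar_ratio.
    """
    if min(vector_ng, vector_bp, insert_bp, molar_ratio) <= 0:
        raise ValueError("all inputs must be positive")
    return vector_ng * (insert_bp / vector_bp) * molar_ratio


# Hypothetical sizes: 5000 bp vector, 1000 bp ORF fragment, 50 ng of vector.
needed_ng = insert_mass_ng(50.0, 5000, 1000, molar_ratio=3.0)
```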

In order to introduce the recombinant plasmid into a suitable strain, 100 µl E. coli DH5 
competent cells were incubated with the ligase reaction solution for 40 minutes on ice, then at 
37°C for 3 minutes, then, after adding 800 µl LB broth, again at 37°C for 20 minutes. The 
cells were then centrifuged at maximum speed in an Eppendorf microfuge and resuspended in 
approximately 200 µl of the supernatant. The suspension was then plated on LB ampicillin 
(100 µg/ml). 

The screening of the recombinant clones was performed by growing 5 
randomly-chosen colonies overnight at 37°C in either 2 ml (pGEX or pTRC clones) or 5 ml 
(pET clones) LB broth + 100 µg/ml ampicillin. The cells were then pelleted and the DNA 
extracted using the Qiagen QIAprep Spin Miniprep Kit, following the manufacturer's 
instructions, to a final volume of 30 µl. 5 µl of each individual miniprep (approximately 1 µg) 
were digested with either NdeI/XhoI or BamHI/XhoI and the whole digestion loaded onto a 1- 
1.5% agarose gel (depending on the expected insert size), in parallel with the molecular 
weight marker (1Kb DNA Ladder, GIBCO). The screening of the positive clones was made 
on the basis of the correct insert size. 

Cloning 

Certain ORFs may be cloned into the pGEX-HIS vector using EcoRI-PstI, 
EcoRI-SalI, or SalI-PstI cloning sites. After cloning, the recombinant plasmids may be 
introduced in the E. coli host W3110. 



Expression 

Each ORF cloned into the expression vector may then be transformed into the strain 
suitable for expression of the recombinant protein product. 1 µl of each construct was used to 
transform 30 µl of E. coli BL21 (pGEX vector), E. coli TOP10 (pTRC vector) or E. coli BL21- 
DE3 (pET vector), as described above. In the case of the pGEX-His vector, the same E. coli 
strain (W3110) was used for initial cloning and expression. Single recombinant colonies 
were inoculated into 2 ml LB+Amp (100 µg/ml), incubated at 37°C overnight, then diluted 
1:30 in 20 ml of LB+Amp (100 µg/ml) in 100 ml flasks, making sure that the OD600 ranged 
between 0.1 and 0.15. The flasks were incubated at 30°C in gyratory water bath shakers 
until the OD indicated exponential growth suitable for induction of expression (0.4-0.8 OD for 
pET and pTRC vectors; 0.8-1 OD for pGEX and pGEX-His vectors). For the pET, pTRC 
and pGEX-His vectors, protein expression was induced by addition of 1 mM IPTG, 
whereas in the case of the pGEX system the final concentration of IPTG was 0.2 mM. After 3 
hours incubation at 30°C, the final concentration of the sample was checked by OD. In order 
to check expression, 1 ml of each sample was removed, centrifuged in a microfuge, the pellet 
resuspended in PBS, and analysed by 12% SDS-PAGE with Coomassie Blue staining. The 
whole sample was centrifuged at 6000g and the pellet resuspended in PBS for further use. 

GST-fusion proteins large-scale purification. 

A single colony was grown overnight at 37°C on LB+Amp agar plate. The bacteria 
were inoculated into 20 ml of LB+Amp liquid culture in a water bath shaker and grown 
overnight. Bacteria were diluted 1:30 into 600 ml of fresh medium and allowed to grow at 
the optimal temperature (20-37°C) to OD550 0.8-1. Protein expression was induced with 
0.2 mM IPTG followed by three hours incubation. The culture was centrifuged at 8000 rpm 
at 4°C. The supernatant was discarded and the bacterial pellet was resuspended in 7.5 ml 
cold PBS. The cells were disrupted by sonication on ice for 30 sec at 40W using a Branson 
sonifier B-15, frozen and thawed two times and centrifuged again. The supernatant was 
collected and mixed with 150 µl Glutathione-Sepharose 4B resin (Pharmacia) (previously 
washed with PBS) and incubated at room temperature for 30 minutes. The sample was 
centrifuged at 700g for 5 minutes at 4°C. The resin was washed twice with 10 ml cold PBS 
for 10 minutes, resuspended in 1 ml cold PBS, and loaded on a disposable column. The resin 
was washed twice with 2 ml cold PBS until the flow-through reached OD280 of 0.02-0.06. 
The GST-fusion protein was eluted by addition of 700 µl cold Glutathione elution buffer 
(10 mM reduced glutathione, 50 mM Tris-HCl) and fractions collected until the OD280 was 0.1. 
21 µl of each fraction were loaded on a 12% SDS gel using either Biorad SDS-PAGE 
Molecular weight standard broad range (M1) (200, 116.25, 97.4, 66.2, 45, 31, 21.5, 14.4, 6.5 
kDa) or Amersham Rainbow Marker (M2) (220, 66, 46, 30, 21.5, 14.3 kDa) as standards. As 
the MW of GST is 26 kDa, this value must be added to the MW of each GST-fusion protein. 

His-fusion soluble proteins large-scale purification. 

A single colony was grown overnight at 37°C on a LB + Amp agar plate. The 
bacteria were inoculated into 20 ml of LB+Amp liquid culture and incubated overnight in a 
water bath shaker. Bacteria were diluted 1:30 into 600 ml fresh medium and allowed to grow 
at the optimal temperature (20-37°C) to OD550 0.6-0.8. Protein expression was induced by 
addition of 1 mM IPTG and the culture further incubated for three hours. The culture was 
centrifuged at 8000 rpm at 4°C, the supernatant was discarded and the bacterial pellet was 
resuspended in 7.5 ml cold 10 mM imidazole buffer (300 mM NaCl, 50 mM phosphate buffer, 
10 mM imidazole, pH 8). The cells were disrupted by sonication on ice for 30 sec at 40W 
using a Branson sonifier B-15, frozen and thawed two times and centrifuged again. The 
supernatant was collected and mixed with 150 µl Ni2+-resin (Pharmacia) (previously washed 
with 10 mM imidazole buffer) and incubated at room temperature with gentle agitation for 30 
minutes. The sample was centrifuged at 700g for 5 minutes at 4°C. The resin was washed 
twice with 10 ml cold 10 mM imidazole buffer for 10 minutes, resuspended in 1 ml cold 
10 mM imidazole buffer and loaded on a disposable column. The resin was washed at 4°C 
with 2 ml cold 10 mM imidazole buffer until the flow-through reached the OD280 of 0.02- 
0.06. The resin was washed with 2 ml cold 20 mM imidazole buffer (300 mM NaCl, 50 mM 
phosphate buffer, 20 mM imidazole, pH 8) until the flow-through reached the OD280 of 0.02- 
0.06. The His-fusion protein was eluted by addition of 700 µl cold 250 mM imidazole buffer 
(300 mM NaCl, 50 mM phosphate buffer, 250 mM imidazole, pH 8) and fractions collected 
until the OD280 was 0.1. 21 µl of each fraction were loaded on a 12% SDS gel. 

His-fusion insoluble proteins large-scale purification. 

A single colony was grown overnight at 37°C on a LB + Amp agar plate. The 
bacteria were inoculated into 20 ml of LB+Amp liquid culture in a water bath shaker and 
grown overnight. Bacteria were diluted 1:30 into 600 ml fresh medium and left to grow at the 
optimal temperature (37°C) to OD550 0.6-0.8. Protein expression was induced by addition 
of 1 mM IPTG and the culture further incubated for three hours. The culture was centrifuged 
at 8000 rpm at 4°C. The supernatant was discarded and the bacterial pellet was resuspended 
in 7.5 ml buffer B (urea 8M, 10 mM Tris-HCl, 100 mM phosphate buffer, pH 8.8). The cells 
were disrupted by sonication on ice for 30 sec at 40W using a Branson sonifier B-15, frozen 
and thawed twice and centrifuged again. The supernatant was stored at -20°C, while the 
pellets were resuspended in 2 ml guanidine buffer (6M guanidine hydrochloride, 100 mM 
phosphate buffer, 10 mM Tris-HCl, pH 7.5) and treated in a homogenizer for 10 cycles. The 
product was centrifuged at 13000 rpm for 40 minutes. The supernatant was mixed with 
150 µl Ni2+-resin (Pharmacia) (previously washed with buffer B) and incubated at room 
temperature with gentle agitation for 30 minutes. The sample was centrifuged at 700 g for 5 
minutes at 4°C. The resin was washed twice with 10 ml buffer B for 10 minutes, 
resuspended in 1 ml buffer B, and loaded on a disposable column. The resin was washed at 
room temperature with 2 ml buffer B until the flow-through reached the OD280 of 0.02-0.06. 
The resin was washed with 2 ml buffer C (urea 8M, 10 mM Tris-HCl, 100 mM phosphate 
buffer, pH 6.3) until the flow-through reached the OD280 of 0.02-0.06. The His-fusion 
protein was eluted by addition of 700 µl elution buffer (urea 8M, 10 mM Tris-HCl, 100 mM 
phosphate buffer, pH 4.5) and fractions collected until the OD280 was 0.1. 21 µl of each 
fraction were loaded on a 12% SDS gel. 

His-fusion proteins renaturation

10% glycerol was added to the denatured proteins. The proteins were then diluted to
20µg/ml using dialysis buffer I (10% glycerol, 0.5M arginine, 50mM phosphate buffer, 5mM
reduced glutathione, 0.5mM oxidised glutathione, 2M urea, pH 8.8) and dialysed against the
same buffer at 4°C for 12-14 hours. The protein was further dialysed against dialysis buffer
II (10% glycerol, 0.5M arginine, 50mM phosphate buffer, 5mM reduced glutathione, 0.5mM
oxidised glutathione, pH 8.8) for 12-14 hours at 4°C. Protein concentration was evaluated
using the formula:

Protein (mg/ml) = (1.55 x OD280) - (0.76 x OD260) 
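
The formula above can be sketched as a small helper. This is only an illustration of the arithmetic: the function name and the example absorbance readings are assumptions introduced here, not values from the experiments described.

```python
def protein_mg_per_ml(od280: float, od260: float) -> float:
    """Estimate protein concentration (mg/ml) from absorbance readings.

    Implements the formula quoted in the text:
        Protein (mg/ml) = (1.55 x OD280) - (0.76 x OD260)
    """
    return 1.55 * od280 - 0.76 * od260


# Hypothetical readings, chosen only to illustrate the calculation.
example = protein_mg_per_ml(od280=1.0, od260=0.5)
print(f"{example:.2f} mg/ml")
```

The OD260 term subtracts the contribution of nucleic acid contamination, so a sample with a high OD260 relative to OD280 yields a lower protein estimate.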

Mice immunisations

20µg of each purified protein were used to immunise mice intraperitoneally. In the
case of some ORFs, Balb-C mice were immunised with Al(OH)3 as adjuvant on days 1, 21
and 42, and the immune response was monitored in samples taken on day 56. For other ORFs,
CD1 mice could be immunised using the same protocol. For other ORFs, CD1 mice could be
immunised using Freund's adjuvant, and the same immunisation protocol was used, except
that the immune response was measured on day 42, rather than 56. Similarly, for still other
ORFs, CD1 mice could be immunised with Freund's adjuvant, but the immune response was
measured on day 49.

ELISA assay (sera analysis)

The acapsulated MenB M7 strain was plated on chocolate agar plates and incubated
overnight at 37°C. Bacterial colonies were collected from the agar plates using a sterile
dacron swab and inoculated into 7ml of Mueller-Hinton Broth (Difco) containing 0.25%
glucose. Bacterial growth was monitored every 30 minutes by following OD520. The
bacteria were allowed to grow until the OD reached a value of 0.3-0.4. The culture was
centrifuged for 10 minutes at 10000 rpm. The supernatant was discarded and the bacteria were
washed once with PBS, resuspended in PBS containing 0.025% formaldehyde, and incubated
for 2 hours at room temperature and then overnight at 4°C with stirring. 100µl of bacterial cells
were added to each well of a 96 well Greiner plate and incubated overnight at 4°C. The wells
were then washed three times with PBT washing buffer (0.1% Tween-20 in PBS). 200µl of
saturation buffer (2.7% polyvinylpyrrolidone 10 in water) was added to each well and the
plates were incubated for 2 hours at 37°C. Wells were washed three times with PBT. 200µl of
diluted sera (dilution buffer: 1% BSA, 0.1% Tween-20, 0.1% NaN3 in PBS) were added to
each well and the plates were incubated for 90 minutes at 37°C. Wells were washed three times
with PBT. 100µl of HRP-conjugated rabbit anti-mouse (Dako) serum diluted 1:2000 in
dilution buffer were added to each well and the plates were incubated for 90 minutes at 37°C.
Wells were washed three times with PBT buffer. 100µl of substrate buffer for HRP (25 ml
of citrate buffer pH 5, 10 mg of o-phenylenediamine and 10µl of H2O) were added to each well
and the plates were left at room temperature for 20 minutes. 100µl of H2SO4 was added to each
well and OD490 was followed. The ELISA was considered positive when OD490 was 2.5
times that of the respective pre-immune sera.
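
The positivity criterion can be written as a one-line comparison. The sketch below assumes that "2.5 times" means "at least 2.5 times" (the text does not say whether the threshold is inclusive); the function name and the example readings are hypothetical.

```python
def elisa_positive(od490_immune: float, od490_preimmune: float,
                   fold: float = 2.5) -> bool:
    """Return True when the immune serum reading reaches the threshold.

    Assumption: the criterion is read as immune OD490 >= fold x pre-immune
    OD490, with fold = 2.5 as stated in the text.
    """
    return od490_immune >= fold * od490_preimmune


# Hypothetical readings for illustration only.
print(elisa_positive(od490_immune=1.00, od490_preimmune=0.30))
print(elisa_positive(od490_immune=0.50, od490_preimmune=0.30))
```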

FACScan bacteria Binding Assay procedure.

The acapsulated MenB M7 strain was plated on chocolate agar plates and incubated
overnight at 37°C. Bacterial colonies were collected from the agar plates using a sterile
dacron swab and inoculated into 4 tubes, each containing 8ml of Mueller-Hinton Broth (Difco)
with 0.25% glucose. Bacterial growth was monitored every 30 minutes by following
OD620. The bacteria were allowed to grow until the OD reached a value of 0.35-0.5. The culture
was centrifuged for 10 minutes at 4000 rpm. The supernatant was discarded and the pellet
was resuspended in blocking buffer (1% BSA, 0.4% NaN3) and centrifuged for 5 minutes at
4000 rpm. Cells were resuspended in blocking buffer to reach an OD620 of 0.07. 100µl of bacterial
cells were added to each well of a Costar 96 well plate. 100µl of diluted (1:200) sera (in
blocking buffer) were added to each well and the plates were incubated for 2 hours at 4°C. Cells were
centrifuged for 5 minutes at 4000 rpm, the supernatant was aspirated and the cells were washed by
addition of 200µl/well of blocking buffer. 100µl of R-Phycoerythrin conjugated F(ab)2
goat anti-mouse, diluted 1:100, was added to each well and the plates were incubated for 1 hour at
4°C. Cells were spun down by centrifugation at 4000 rpm for 5 minutes and washed by
addition of 200µl/well of blocking buffer. The supernatant was aspirated and the cells were
resuspended in 200µl/well of PBS, 0.25% formaldehyde. Samples were transferred to
FACScan tubes and read. The FACScan settings were: FL1 on, FL2 and FL3
off; FSC-H Threshold: 92; FSC PMT Voltage: E 02; SSC PMT: 474; Amp. Gains 7.1; FL-2
PMT: 539. Compensation values: 0.

OMV preparations 

Bacteria were grown overnight on 5 GC plates, harvested with a loop and resuspended
in 10 ml 20mM Tris-HCl. Heat inactivation was performed at 56°C for 30 minutes and the
bacteria were disrupted by sonication for 10 minutes on ice (50% duty cycle, 50% output). Unbroken
cells were removed by centrifugation at 5000g for 10 minutes and the total cell envelope
fraction was recovered by centrifugation at 50000g at 4°C for 75 minutes. To extract cytoplasmic
membrane proteins from the crude outer membranes, the whole fraction was resuspended in
2% sarkosyl (Sigma) and incubated at room temperature for 20 minutes. The suspension was
centrifuged at 10000g for 10 minutes to remove aggregates, and the supernatant was further
ultracentrifuged at 50000g for 75 minutes to pellet the outer membranes. The outer
membranes were resuspended in 10mM Tris-HCl, pH 8 and the protein concentration was
measured by the Bio-Rad Protein assay, using BSA as a standard.

Whole Extracts preparation 

Bacteria were grown overnight on a GC plate, harvested with a loop and resuspended
in 1ml of 20mM Tris-HCl. Heat inactivation was performed at 56°C for 30 minutes.

Western blotting 

Purified proteins (500ng/lane), outer membrane vesicles (5µg) and total cell extracts
(25µg) derived from MenB strain 2996 were loaded on 15% SDS-PAGE and transferred to a
nitrocellulose membrane. The transfer was performed for 2 hours at 150mA at 4°C, in
transferring buffer (0.3% Tris base, 1.44% glycine, 20% methanol). The membrane was
saturated by overnight incubation at 4°C in saturation buffer (10% skimmed milk, 0.1%
Triton X100 in PBS). The membrane was washed twice with washing buffer (3% skimmed
milk, 0.1% Triton X100 in PBS) and incubated for 2 hours at 37°C with mouse sera
diluted 1:200 in washing buffer. The membrane was washed twice and incubated for 90 minutes
with a 1:2000 dilution of horseradish peroxidase labeled anti-mouse Ig. The membrane was
washed twice with 0.1% Triton X100 in PBS and developed with the Opti-4CN Substrate Kit
(Bio-Rad). The reaction was stopped by adding water.

Bactericidal assay 

MC58 strain was grown overnight at 37°C on chocolate agar plates. 5-7 colonies
were collected and used to inoculate 7ml Mueller-Hinton broth. The suspension was
incubated at 37°C on a nutator and allowed to grow until OD620 was between 0.5 and 0.8. The
culture was aliquoted into sterile 1.5ml Eppendorf tubes and centrifuged for 20 minutes at
maximum speed in a microfuge. The pellet was washed once in Gey's buffer (Gibco) and
resuspended in the same buffer to an OD620 of 0.5, diluted 1:20000 in Gey's buffer and stored
at 25°C.

50µl of Gey's buffer/1% BSA was added to each well of a 96-well tissue culture
plate. 25µl of diluted (1:100) mouse sera (dilution buffer: Gey's buffer/0.2% BSA) were added
to each well and the plate was incubated at 4°C. 25µl of the previously described bacterial
suspension were added to each well. 25µl of either heat-inactivated (56°C waterbath for 30
minutes) or normal baby rabbit complement were added to each well. Immediately after the
addition of the baby rabbit complement, 22µl of each sample/well were plated on Mueller-
Hinton agar plates (time 0). The 96-well plate was incubated for 1 hour at 37°C with rotation
and then 22µl of each sample/well were plated on Mueller-Hinton agar plates (time 1). After
overnight incubation the colonies corresponding to time 0 and time 1h were counted.

The following DNA and amino acid sequences are identified by titles of the following
form: [g, m, or a] [#].[seq or pep], where "g" means a sequence from N. gonorrhoeae, "m"
means a sequence from N. meningitidis B, and "a" means a sequence from N. meningitidis A;
"#" means the number of the sequence; "seq" means a DNA sequence, and "pep" means an
amino acid sequence. For example, "g001.seq" refers to an N. gonorrhoeae DNA sequence,
number 1. The presence of the suffix "-1" or "-2" to these sequences indicates an additional
sequence found for the same ORF. Further, open reading frames are identified as ORF #,
where "#" means the number of the ORF, corresponding to the number of the sequence
which encodes the ORF, and the ORF designations may be suffixed with ".ng" or ".a",
indicating that the ORF corresponds to a N. gonorrhoeae sequence or a N. meningitidis A
sequence, respectively. Computer analysis was performed for the comparisons that follow
between "g", "m", and "a" peptide sequences; and therein the "pep" suffix is implied where
not expressly stated.
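
The title convention described above lends itself to a small parser. The following is a minimal sketch under stated assumptions: the function name, the returned dictionary keys and the regular expression are introduced here for illustration and are not part of the original disclosure.

```python
import re

# Prefix letters and suffixes as defined by the naming convention in the text.
ORGANISMS = {
    "g": "N. gonorrhoeae",
    "m": "N. meningitidis B",
    "a": "N. meningitidis A",
}
KINDS = {"seq": "DNA", "pep": "amino acid"}

# prefix letter, sequence number, optional "-1"/"-2" style suffix, seq or pep
TITLE_RE = re.compile(r"^([gma])(\d+)(?:-(\d+))?\.(seq|pep)$")


def parse_title(title: str) -> dict:
    """Split a title such as 'm519-1.pep' into its documented parts."""
    match = TITLE_RE.match(title)
    if match is None:
        raise ValueError(f"not a recognised sequence title: {title!r}")
    prefix, number, variant, kind = match.groups()
    return {
        "organism": ORGANISMS[prefix],
        "number": int(number),
        "variant": int(variant) if variant else None,  # additional sequence, if any
        "kind": KINDS[kind],
    }


print(parse_title("g001.seq"))
print(parse_title("m519-1.pep"))
```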

EXAMPLE 1 

The following ORFs were predicted from the contig sequences and/or the full length
sequence using the methods herein described.

Localization of the ORFs

ORF: contig:
279 gnm4.seq

The following partial DNA sequence was identified in N. meningitidis <SEQ ID 962>:
m279.seq
  1 ATAACGCGGA TTTGCGGCTG CTTGATTTCA ACGSTTTTCA GGGCTTCGGC
 51 AAGTTTGTCG GCGGCGGGTT TCATCAGGCT GCAATGGGAA GGTACGGACA
101 CGGGCAGCGG CAGGGCGCGT TTGGCACCGG CTTCTTTGGC GGCAGCCATG
151 GCGCGTCCGA CGGCGGCGGC GTTGCCTGCA ATCACGATTT GTCCGGGTGA
201 GTTGAAGTTG ACGGCTTCGA CCACTTCGCT TTGGGCGGCT TCGGCACAAA
251 TGGCTTTAAC CTGCTCATCT TCCAAGCCGA GAATCGCCGC CATTGCGCCC
301 ACGCCTTGCG GTACGGCGGA CTGCATCAGT TCGGCGCGCA GGCGCACGAG
351 TTTGACCGCG TCGGCAAAAT TCAATGCGCC GGCGGCAACG AGTGCGGTGT
401 ATTCGCCGAG GCTGTGTCCG GCAACGGCGG CAGGCGTTTT GCCGCCCGCT
451 TCTAAATAG

This corresponds to the amino acid sequence <SEQ ID 963; ORF 279>: 
m279.pep
  1 ITRICGCLIS TVFRASASLS AAGFIRLQWE GTDTGSGRAR LAPASLAAAM
 51 ARPTAAALPA ITICPGELKL TASTTSLWAA SAQMALTCSS SKPRIAAIAP
101 TPCGTADCIS SARRRTSLTA SAKFNAPAAT SAVYSPRLCP ATAAGVLPPA
151 SK*
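
The correspondence between a DNA listing and its amino acid listing can be checked by translating the DNA codon by codon. The sketch below is an illustration only: it assumes the standard codon assignments, reads the sequence in frame from its first base, and renders any codon containing an ambiguous base as "X" (a simplification introduced here). The function name is hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1; '*' marks a stop codon."""
    dna = dna.upper().replace(" ", "")
    protein = []
    for i in range(0, len(dna) - 2, 3):
        codon = dna[i:i + 3]
        # Codons with ambiguous bases (e.g. N) are reported as X here.
        protein.append(CODON_TABLE.get(codon, "X"))
    return "".join(protein)


# First 30 bases and last 9 bases of m279.seq as listed above.
print(translate("ATAACGCGGA TTTGCGGCTG CTTGATTTCA"))
print(translate("TCTAAATAG"))
```

The two printed fragments can be compared with the first ten residues and the final "SK*" of m279.pep.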

The following partial DNA sequence was identified in N. gonorrhoeae <SEQ ID 964>:
g279.seq
  1 atgacgcgga tttgcggctg cttgatttca acggttttga gtgtttcggc
 51 aagtttgtcg gcggcgggtt tcatcaggct gcaatgggaa ggaacggata
101 ccggcagcgg cagggcgcgt ttggctccgg cttctttggc ggcagccatg
151 gtgcgtccga cggcggcggc gttgcctgca atcacgactt gtccgggcga
201 gttgaagttg acggcttcga ccacttcgcc ctgtgcggat tcggcacaaa
251 tctgcctgac ctgttcatct tccaaaccca aaatggccgc cattgcgcct
301 acgccttgcg gtacggcgga ctgcatcagt tcggcgcgca ggcggacgag
351 tttgacggca tcggcaaaat ccaatgcttc ggcggcgaca agcgcggtgt
401 attcgccgag gctgtgtccg gcaacggcgg caggcgtttt gccgcccact
451 tccaaatag

This corresponds to the amino acid sequence <SEQ ID 965; ORF 279.ng>:
g279.pep
  1 MTRICGCLIS TVLSVSASLS AAGFIRLQWE GTDTGSGRAR LAPASLAAAM
 51 VRPTAAALPA ITTCPGELKL TASTTSPCAD SAQICLTCSS SKPKMAAIAP
101 TPCGTADCIS SARRRTSLTA SAKSNASAAT SAVYSPRLCP ATAAGVLPPT
151 SK*

ORF 279 shows 89.5% identity over a 152 aa overlap with a predicted ORF (ORF 279.ng)
from N. gonorrhoeae:

                    10        20        30        40        50        60
m279.pep    ITRICGCLISTVFRASASLSAAGFIRLQWEGTDTGSGRARLAPASLAAAMARPTAAALPA
             |||||||||||   ||||||||||||||||||||||||||||||||||| |||||||||
g279        MTRICGCLISTVLSVSASLSAAGFIRLQWEGTDTGSGRARLAPASLAAAMVRPTAAALPA
                    10        20        30        40        50        60

                    70        80        90       100       110       120
m279.pep    ITICPGELKLTASTTSLWAASAQMALTCSSSKPRIAAIAPTPCGTADCISSARRRTSLTA
            || |||||||||||||  | |||  ||||||||  |||||||||||||||||||||||||
g279        ITTCPGELKLTASTTSPCADSAQICLTCSSSKPKMAAIAPTPCGTADCISSARRRTSLTA
                    70        80        90       100       110       120

                   130       140       150
m279.pep    SAKFNAPAATSAVYSPRLCPATAAGVLPPASKX
            ||| || |||||||||||||||||||||| |||
g279        SAKSNASAATSAVYSPRLCPATAAGVLPPTSKX
                   130       140       150

The following partial DNA sequence was identified in N. meningitidis <SEQ ID 966>:

a279.seq
  1 ATGACNCNGA TTTGCGGCTG CTTGATTTCA ACGGTTTNNA GGGCTTCGGC
 51 GAGTTTGTCG GCGGCGGGTT TCATGAGGCT GCAATGGGAA GGTACNGACA
101 CNGGCAGCGG CAGGGCGCGT TTGGCGCCGG CTTCTTTGGC GGCAAGCATA
151 GCGCGCTCGA CGGCGGCGGC ATTGCCTGCA ATCACGACTT GTCCGGGCGA
201 GTTGAAGTTG ACGGCTTCAA CCACTTCATC CTGTGCGGAT TCGGCGCAAA
251 TTTGTTTTAC CTGTTCATCT TCCAAGCCGA GAATCGCCGC CATTGCGCCC
301 ACGCCTTGCG GTACGGCGGA CTGCATCAGT TCGGCGCGCA NGCGCACGAG
351 TTTGACCGCG TCGGCAAAAT CCAATGCGCC GGCGGCAACN AGTGCGGTGT
401 ATTCGCCGAN GCTGTGTCCG GCAACGGCGG CAGGCGTTTT GCCGCCCGCT
451 TCCGAATAG

This corresponds to the amino acid sequence <SEQ ID 967; ORF 279.a>:
a279.pep
  1 MTXICGCLIS TVXRASASLS AAGFMRLQWE GTDTGSGRAR LAPASLAASI
 51 ARSTAAALPA ITTCPGELKL TASTTSSCAD SAQICFTCSS SKPRIAAIAP
101 TPCGTADCIS SARXRTSLTA SAKSNAPAAT SAVYSPXLCP ATAAGVLPPA
151 SE*

m279/a279 ORFs 279 and 279.a showed a 88.2% identity in 152 aa overlap

                    10        20        30        40        50        60
m279.pep    ITRICGCLISTVFRASASLSAAGFIRLQWEGTDTGSGRARLAPASLAAAMARPTAAALPA
             | ||||||||| ||||||||||| |||||||||||||||||||||||  || |||||||
a279        MTXICGCLISTVXRASASLSAAGFMRLQWEGTDTGSGRARLAPASLAASIARSTAAALPA
                    10        20        30        40        50        60

                    70        80        90       100       110       120
m279.pep    ITICPGELKLTASTTSLWAASAQMALTCSSSKPRIAAIAPTPCGTADCISSARRRTSLTA
            || |||||||||||||  | |||   ||||||||||||||||||||||||||| ||||||
a279        ITTCPGELKLTASTTSSCADSAQICFTCSSSKPRIAAIAPTPCGTADCISSARXRTSLTA
                    70        80        90       100       110       120

                   130       140       150
m279.pep    SAKFNAPAATSAVYSPRLCPATAAGVLPPASKX
            ||| |||||||||||| |||||||||||||| |
a279        SAKSNAPAATSAVYSPXLCPATAAGVLPPASEX
                   130       140       150
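
The identity figure quoted for an alignment can be reproduced by counting matching positions. The sketch below assumes that identity is defined as identical positions divided by the aligned length, which is an assumption about how the comparison program reports it; the function name is hypothetical. The two strings are the 152 residues of m279.pep and a279.pep from the listings above, without the terminal stop.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of aligned positions carrying the same residue.

    Both strings must already be aligned (same length). Gap characters
    ('-') never count as identical.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    identical = sum(1 for a, b in zip(seq_a, seq_b) if a == b and a != "-")
    return 100.0 * identical / len(seq_a)


M279 = (
    "ITRICGCLISTVFRASASLSAAGFIRLQWEGTDTGSGRARLAPASLAAAMARPTAAALPA"
    "ITICPGELKLTASTTSLWAASAQMALTCSSSKPRIAAIAPTPCGTADCISSARRRTSLTA"
    "SAKFNAPAATSAVYSPRLCPATAAGVLPPASK"
)
A279 = (
    "MTXICGCLISTVXRASASLSAAGFMRLQWEGTDTGSGRARLAPASLAASIARSTAAALPA"
    "ITTCPGELKLTASTTSSCADSAQICFTCSSSKPRIAAIAPTPCGTADCISSARXRTSLTA"
    "SAKSNAPAATSAVYSPXLCPATAAGVLPPASE"
)

print(f"{percent_identity(M279, A279):.1f}% identity over {len(M279)} aa")
```

Here 134 of the 152 positions match, which agrees with the 88.2% stated for ORFs 279 and 279.a.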



519 and 519-1 gnm7.seq 

The following partial DNA sequence was identified in N. meningitidis <SEQ ID 968>:
m519.seq (partial)
  1 ..TCCGTTATCG GGCGTATGGA GTTGGACAAA ACGTTTGAAG AACGCGACGA
 51 AATCAACAGT ACTGTTGTTG CGGCTTTGGA CGAGGCGGCC GGGgCTTgGG
101 GTGTGAAGGT TTTGCGTTAT GAGATTAAAG ACTTGGTTCC GCCGCAAGAA
151 ATCCTTCGCT CAATGCAGGC GCAAATTACT GCCGAACGCG AAAAACGCGC
201 CCGTATCGCC GAATCCGAAG GTCGTAAAAT CGAACAAATC AACCTTGCCA
251 GTGGTCAGCG CGAAGCCGAA ATCCAACAAT CCGAAGGCGA GGCTCAGGCT
301 GCGGTCAATG CGTCAAATGC CGAGAAAATC GCCCGCATCA ACCGCGCCAA
351 AGGTGAAGCG GAATCCTTGC GCCTTGTTGC CGAAGCCAAT GCCGAAGCCA
401 TCCGTCAAAT TGCCGCCGCC CTTCAAACCC AAGGCGGTGC GGATGCGGTC
451 AATCTGAAGA TTGCGGAACA ATACGTCGCT GCGTTCAACA ATCTTGCCAA
501 AGAAAGCAAT ACGCTGATTA TGCCCGCCAA TGTTGCCGAC ATCGGCAGCC
551 TGATTTCTGC CGGTATGAAA ATTATCGACA GCAGCAAAAC CGCCAAaTAA

This corresponds to the amino acid sequence <SEQ ID 969; ORF 519>:
m519.pep (partial)
  1 ..SVIGRMELDK TFEERDEINS TVVAALDEAA GAWGVKVLRY EIKDLVPPQE
 51 ILRSMQAQIT AEREKRARIA ESEGRKIEQI NLASGQREAE IQQSEGEAQA
101 AVNASNAEKI ARINRAKGEA ESLRLVAEAN AEAIRQIAAA LQTQGGADAV
151 NLKIAEQYVA AFNNLAKESN TLIMPANVAD IGSLISAGMK IIDSSKTAK*

The following partial DNA sequence was identified in N. gonorrhoeae <SEQ ID 970>:
g519.seq
  1 [...] tcattatctt gttggcagcc [...]
 51 atcctttgtc [...] agcaggaagt [...]
101 ggcgtttcca tcgcgccctg acggccggtt [...]
151 atcgaccgcg tcgcctaccg ccattcgctg [...]
201 acccagccag gtctgcatca cgcgcgataa [...]
251 gcatcatcta tttccaagta accgatccca [...]
301 agcaactaca ttatggcaat tacccagctt [...]
351 cgttatcggg cgtatggagt tggacaaaac [...]
401 tcaacagtac cgtcgtctcc gccctcgatg [...]
451 gtgaaagtcc tccgttacga aatcaaggat [...]
501 ccttcgcgca atgcaggcac aaattaccgc [...]
551 gtattgccga atccgaaggc cgtaaaatcg [...]
601 ggtcagcgtg aagccgaaat ccaacaatcc [...]
651 ggtcaatgcg tccaatgccg agaaaatcgc [...]
701 gcgaagcgga atccctgcgc cttgttgccg [...]
751 cgtcaaattg ccgccgccct tcaaacccaa [...]
801 tctgaagatt gcgggacaat acgttaccgc [...]
851 aagacaatac gcggattaag cccgccaagg [...]
901 aattttcggc ggcatgaaaa attttcgcca [...]
951 [...]

This corresponds to the amino acid sequence <SEQ ID 971; ORF 519.ng>:
g519.pep
  1 MEFFIILLAA VAVFGFKSFV VIPQQEVHVV ERLGRFHRAL TAGLNILIPF
 51 IDRVAYRHSL KEIPLDVPSQ VCITRDNTQL TVDGIIYFQV TDPKLASYGS
101 SNYIMAITQL AQTTLRSVIG RMELDKTFEE RDEINSTVVS ALDEAAGAWG
151 VKVLRYEIKD LVPPQEILRA MQAQITAERE KRARIAESEG RKIEQINLAS
201 GQREAEIQQS EGEAQAAVNA SNAEKIARIN RAKGEAESLR LVAEANAEAN
251 RQIAAALQTQ SGADAVNLKI AGQYVTAFKN LAKEDNTRIK PAKVAEIGNP
301 NFRRHEKPSP EAKTAK*

ORF 519 shows 87.5% identity over a 200 aa overlap with a predicted ORF (ORF 519.ng)
from N. gonorrhoeae:

m519/g519

                                                  10        20        30
m519.pep                                  SVIGRMELDKTFEERDEINSTVVAALDEAA
                                          ||||||||||||||||||||||| ||||||
g519        YFQVTDPKLASYGSSNYIMAITQLAQTTLRSVIGRMELDKTFEERDEINSTVVSALDEAA
              90       100       110       120       130       140

                    40        50        60        70        80        90
m519.pep    GAWGVKVLRYEIKDLVPPQEILRSMQAQITAEREKRARIAESEGRKIEQINLASGQREAE
            ||||||||||||||||||||||| ||||||||||||||||||||||||||||||||||||
g519        GAWGVKVLRYEIKDLVPPQEILRAMQAQITAEREKRARIAESEGRKIEQINLASGQREAE
             150       160       170       180       190       200

                   100       110       120       130       140       150
m519.pep    IQQSEGEAQAAVNASNAEKIARINRAKGEAESLRLVAEANAEAIRQIAAALQTQGGADAV
            ||||||||||||||||||||||||||||||||||||||||||| |||||||||| |||||
g519        IQQSEGEAQAAVNASNAEKIARINRAKGEAESLRLVAEANAEANRQIAAALQTQSGADAV
             210       220       230       240       250       260

                   160       170       180        190       200
m519.pep    NLKIAEQYVAAFNNLAKESNTLIMPANVADIGSL-ISAGMKIIDSSKTAK
            ||||| ||| || ||||| || | || || ||        |     ||||
g519        NLKIAGQYVTAFKNLAKEDNTRIKPAKVAEIGNPNFRRHEKPSPEAKTAK
             270       280       290       300       310



The following partial DNA sequence was identified in N. meningitidis <SEQ ID 972>:
a519.seq
  1 ATGGAATTTT TCATTATCTT GCTGGCAGCC GTCGTTGTTT TCGGCTTCAA
 51 ATCCTTTGTT GTCATCCCAC AGCAGGAAGT CCACGTTGTC GAAAGGCTCG
101 GGCGTTTCCA TCGCGCCCTG ACGGCCGGTT TGAATATTTT GATTCCCTTT
151 ATCGACCGCG TCGCCTACCG CCATTCGCTG AAAGAAATCC CTTTAGACGT
201 ACCCAGCCAG GTCTGCATCA CGCGCGACAA TACGCAGCTG ACTGTTGACG
251 GTATCATCTA TTTCCAAGTA ACCGACCCCA AACTCGCCTC ATACGGTTCG
301 AGCAACTACA TTATGGCGAT TACCCAGCTT GCCCAAACGA CGCTGCGTTC
351 CGTTATCGGG CGTATGGAAT TGGACAAAAC GTTTGAAGAA CGCGACGAAA
401 TCAACAGCAC CGTCGTCTCC GCCCTCGATG AAGCCGCCGG AGCTTGGGGT
451 GTGAAGGTTT TGCGTTATGA GATTAAAGAC TTGGTTCCGC CGCAAGAAAT
501 CCTTCGCTCA ATGCAGGCGC AAATTACTGC TGAACGCGAA AAACGCGCCC
551 GTATCGCCGA ATCCGAAGGT CGTAAAATCG AACAAATCAA CCTTGCCAGT
601 GGTCAGCGCG AAGCCGAAAT CCAACAATCC GAAGGCGAGG CTCAGGCTGC
651 GGTCAATGCG TCAAATGCCG AGAAAATCGC CCGCATCAAC CGCGCCAAAG
701 GTGAAGCGGA ATCCTTGCGC CTTGTTGCCG AAGCCAATGC CGAAGCCATC
751 CGTCAAATTG CCGCCGCCCT TCAAACCCAA GGCGGTGCGG ATGCGGTCAA
801 TCTGAAGATT GCGGAACAAT ACGTCGCCGC GTTCAACAAT CTTGCCAAAG
851 AAAGCAATAC GCTGATTATG CCCGCCAATG TTGCCGACAT CGGCAGCCTG
901 ATTTCTGCCG GTATGAAAAT TATCGACAGC AGCAAAACCG CCAAATAA

This corresponds to the amino acid sequence <SEQ ID 973; ORF 519.a>:
a519.pep
  1 MEFFIILLAA VVVFGFKSFV VIPQQEVHVV ERLGRFHRAL TAGLNILIPF
 51 IDRVAYRHSL KEIPLDVPSQ VCITRDNTQL TVDGIIYFQV TDPKLASYGS
101 SNYIMAITQL AQTTLRSVIG RMELDKTFEE RDEINSTVVS ALDEAAGAWG
151 VKVLRYEIKD LVPPQEILRS MQAQITAERE KRARIAESEG RKIEQINLAS
201 GQREAEIQQS EGEAQAAVNA SNAEKIARIN RAKGEAESLR LVAEANAEAI
251 RQIAAALQTQ GGADAVNLKI AEQYVAAFNN LAKESNTLIM PANVADIGSL
301 ISAGMKIIDS SKTAK*

m519/a519 ORFs 519 and 519.a showed a 99.5% identity in 199 aa overlap

                                                  10        20        30
m519.pep                                  SVIGRMELDKTFEERDEINSTVVAALDEAA
                                          ||||||||||||||||||||||| ||||||
a519        YFQVTDPKLASYGSSNYIMAITQLAQTTLRSVIGRMELDKTFEERDEINSTVVSALDEAA
              90       100       110       120       130       140

                    40        50        60        70        80        90
m519.pep    GAWGVKVLRYEIKDLVPPQEILRSMQAQITAEREKRARIAESEGRKIEQINLASGQREAE
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
a519        GAWGVKVLRYEIKDLVPPQEILRSMQAQITAEREKRARIAESEGRKIEQINLASGQREAE
             150       160       170       180       190       200

                   100       110       120       130       140       150
m519.pep    IQQSEGEAQAAVNASNAEKIARINRAKGEAESLRLVAEANAEAIRQIAAALQTQGGADAV
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
a519        IQQSEGEAQAAVNASNAEKIARINRAKGEAESLRLVAEANAEAIRQIAAALQTQGGADAV
             210       220       230       240       250       260

                   160       170       180       190       200
m519.pep    NLKIAEQYVAAFNNLAKESNTLIMPANVADIGSLISAGMKIIDSSKTAKX
            ||||||||||||||||||||||||||||||||||||||||||||||||||
a519        NLKIAEQYVAAFNNLAKESNTLIMPANVADIGSLISAGMKIIDSSKTAKX
             270       280       290       300       310



Further work revealed the following DNA sequence identified in N. meningitidis <SEQ ID 974>:
m519-1.seq
  1 ATGGAATTTT TCATTATCTT GTTGGTAGCC GTCGCCGTTT TCGGTTTCAA
 51 ATCCTTTGTT GTCATCCCAC AACAGGAAGT CCACGTTGTC GAAAGGCTGG
101 GGCGTTTCCA TCGCGCCCTG ACGGcCGGTT TGAATATTTT GATTCCCTTT
151 ATCGACCGCG TCGCCTACCG CCATTCGCTG AAAGAAATCC CTTTAGACGT
201 ACCCAGCCAG GTCTGCATCA CGCGCGACAA TACGCAGCTG ACTGTTGACG
251 GCATCATCTA TTTCCAAGTA ACCGACCCCA AACTCGCCTC ATACGGTTCG
301 AGCAACTACA TTATGGCGAT TACCCAGCTT GCCCAAACGA CGCTGCGTTC
351 CGTTATCGGG CGTATGGAGT TGGACAAAAC GTTTGAAGAA CGCGACGAAA
401 TCAACAGTAC TGTTGTTGCG GCTTTGGACG AGGCGGCCGG GGCTTGGGGT
451 GTGAAGGTTT TGCGTTATGA GATTAAAGAC TTGGTTCCGC CGCAAGAAAT
501 CCTTCGCTCA ATGCAGGCGC AAATTACTGC CGAACGCGAA AAACGCGCCC
551 GTATCGCCGA ATCCGAAGGT CGTAAAATCG AACAAATCAA CCTTGCCAGT
601 GGTCAGCGCG AAGCCGAAAT CCAACAATCC GAAGGCGAGG CTCAGGCTGC
651 GGTCAATGCG TCAAATGCCG AGAAAATCGC CCGCATCAAC CGCGCCAAAG
701 GTGAAGCGGA ATCCTTGCGC CTTGTTGCCG AAGCCAATGC CGAAGCCATC
751 CGTCAAATTG CCGCCGCCCT TCAAACCCAA GGCGGTGCGG ATGCGGTCAA
801 TCTGAAGATT GCGGAACAAT ACGTCGCTGC GTTCAACAAT CTTGCCAAAG
851 AAAGCAATAC GCTGATTATG CCCGCCAATG TTGCCGACAT CGGCAGCCTG
901 ATTTCTGCCG GTATGAAAAT TATCGACAGC AGCAAAACCG CCAAATAA

This corresponds to the amino acid sequence <SEQ ID 975; ORF 519-1>:
m519-1.pep
  1 MEFFIILLVA VAVFGFKSFV VIPQQEVHVV ERLGRFHRAL TAGLNILIPF
 51 IDRVAYRHSL KEIPLDVPSQ VCITRDNTQL TVDGIIYFQV TDPKLASYGS
101 SNYIMAITQL AQTTLRSVIG RMELDKTFEE RDEINSTVVA ALDEAAGAWG
151 VKVLRYEIKD LVPPQEILRS MQAQITAERE KRARIAESEG RKIEQINLAS
201 GQREAEIQQS EGEAQAAVNA SNAEKIARIN RAKGEAESLR LVAEANAEAI
251 RQIAAALQTQ GGADAVNLKI AEQYVAAFNN LAKESNTLIM PANVADIGSL
301 ISAGMKIIDS SKTAK*

The following DNA sequence was identified in N. gonorrhoeae <SEQ ID 976>:
g519-1.seq
  1 ATGGAATTTT TCATTATCTT GTTGGCAGCC GTCGCCGTTT TCGGCTTCAA
 51 ATCCTTTGTC GTCATCCCCC AGCAGGAAGT CCACGTTGTC GAAAGGCTCG
101 GGCGTTTCCA TCGCGCCCTG ACGGCCGGTT TGAATATTTT GATTCCCTTT
151 ATCGACCGCG TCGCCTACCG CCATTCGCTG AAAGAAATCC CTTTAGACGT
201 ACCCAGCCAG GTCTGCATCA CGCGCGATAA TACGCAATTG ACTGTTGACG
251 GCATCATCTA TTTCCAAGTA ACCGATCCCA AACTCGCCTC ATACGGTTCG
301 AGCAACTACA TTATGGCAAT TACCCAGCTT GCCCAAACGA CGCTGCGTTC
351 CGTTATCGGG CGTATGGAGT TGGACAAAAC GTTTGAAGAA CGCGACGAAA
401 TCAACAGTAC CGTCGTCTCC GCCCTCGATG AAGCCGCCGG GGCTTGGGGT
451 GTGAAAGTCC TCCGTTACGA AATCAAGGAT TTGGTTCCGC CGCAAGAAAT
501 CCTTCGCGCA ATGCAGGCAC AAATTACCGC CGAACGCGAA AAACGCGCCC
551 GTATTGCCGA ATCCGAAGGC CGTAAAATCG AACAAATCAA CCTTGCCAGT
601 GGTCAGCGTG AAGCCGAAAT CCAACAATCC GAAGGCGAGG CTCAGGCTGC
651 GGTCAATGCG TCCAATGCCG AGAAAATCGC CCGCATCAAC CGCGCCAAAG
701 GCGAAGCGGA ATCCCTGCGC CTTGTTGCCG AAGCCAATGC CGAAGCCATC
751 CGTCAAATTG CCGCCGCCCT TCAAACCCAA GGCGGGGCGG ATGCGGTCAA
801 TCTGAAGATT GCGGAACAAT ACGTAGCCGC GTTCAACAAT CTTGCCAAAG
851 AAAGCAATAC GCTGATTATG CCCGCCAATG TTGCCGACAT CGGCAGCCTG
901 ATTTCTGCCG GCATGAAAAT TATCGACAGC AGCAAAACCG CCAAATAA

This corresponds to the amino acid sequence <SEQ ID 977; ORF 519-1.ng>:
g519-1.pep
  1 MEFFIILLAA VAVFGFKSFV VIPQQEVHVV ERLGRFHRAL TAGLNILIPF
 51 IDRVAYRHSL KEIPLDVPSQ VCITRDNTQL TVDGIIYFQV TDPKLASYGS
101 SNYIMAITQL AQTTLRSVIG RMELDKTFEE RDEINSTVVS ALDEAAGAWG
151 VKVLRYEIKD LVPPQEILRA MQAQITAERE KRARIAESEG RKIEQINLAS
201 GQREAEIQQS EGEAQAAVNA SNAEKIARIN RAKGEAESLR LVAEANAEAI
251 RQIAAALQTQ GGADAVNLKI AEQYVAAFNN LAKESNTLIM PANVADIGSL
301 ISAGMKIIDS SKTAK*



m519-1/g519-1 ORFs 519-1 and 519-1.ng showed a 99.0% identity in 315 aa overlap

                    10        20        30        40        50        60
g519-1.pep  MEFFIILLAAVAVFGFKSFVVIPQQEVHVVERLGRFHRALTAGLNILIPFIDRVAYRHSL
            |||||||| |||||||||||||||||||||||||||||||||||||||||||||||||||
m519-1      MEFFIILLVAVAVFGFKSFVVIPQQEVHVVERLGRFHRALTAGLNILIPFIDRVAYRHSL
                    10        20        30        40        50        60

                    70        80        90       100       110       120
g519-1.pep  KEIPLDVPSQVCITRDNTQLTVDGIIYFQVTDPKLASYGSSNYIMAITQLAQTTLRSVIG
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
m519-1      KEIPLDVPSQVCITRDNTQLTVDGIIYFQVTDPKLASYGSSNYIMAITQLAQTTLRSVIG
                    70        80        90       100       110       120

                   130       140       150       160       170       180
g519-1.pep  RMELDKTFEERDEINSTVVSALDEAAGAWGVKVLRYEIKDLVPPQEILRAMQAQITAERE
            ||||||||||||||||||| ||||||||||||||||||||||||||||| ||||||||||
m519-1      RMELDKTFEERDEINSTVVAALDEAAGAWGVKVLRYEIKDLVPPQEILRSMQAQITAERE
                   130       140       150       160       170       180

                   190       200       210       220       230       240
g519-1.pep  KRARIAESEGRKIEQINLASGQREAEIQQSEGEAQAAVNASNAEKIARINRAKGEAESLR
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
m519-1      KRARIAESEGRKIEQINLASGQREAEIQQSEGEAQAAVNASNAEKIARINRAKGEAESLR
                   190       200       210       220       230       240

                   250       260       270       280       290       300
g519-1.pep  LVAEANAEAIRQIAAALQTQGGADAVNLKIAEQYVAAFNNLAKESNTLIMPANVADIGSL
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
m519-1      LVAEANAEAIRQIAAALQTQGGADAVNLKIAEQYVAAFNNLAKESNTLIMPANVADIGSL
                   250       260       270       280       290       300

                   310
g519-1.pep  ISAGMKIIDSSKTAKX
            ||||||||||||||||
m519-1      ISAGMKIIDSSKTAKX
                   310

The following DNA sequence was identified in N. meningitidis <SEQ ID 978>:
a519-1.seq
  1 ATGGAATTTT TCATTATCTT GCTGGCAGCC GTCGTTGTTT TCGGCTTCAA
 51 ATCCTTTGTT GTCATCCCAC AGCAGGAAGT CCACGTTGTC GAAAGGCTCG
101 GGCGTTTCCA TCGCGCCCTG ACGGCCGGTT TGAATATTTT GATTCCCTTT
151 ATCGACCGCG TCGCCTACCG CCATTCGCTG AAAGAAATCC CTTTAGACGT
201 ACCCAGCCAG GTCTGCATCA CGCGCGACAA TACGCAGCTG ACTGTTGACG
251 GTATCATCTA TTTCCAAGTA ACCGACCCCA AACTCGCCTC ATACGGTTCG
301 AGCAACTACA TTATGGCGAT TACCCAGCTT GCCCAAACGA CGCTGCGTTC
351 CGTTATCGGG CGTATGGAAT TGGACAAAAC GTTTGAAGAA CGCGACGAAA
401 TCAACAGCAC CGTCGTCTCC GCCCTCGATG AAGCCGCCGG AGCTTGGGGT
451 GTGAAGGTTT TGCGTTATGA GATTAAAGAC TTGGTTCCGC CGCAAGAAAT
501 CCTTCGCTCA ATGCAGGCGC AAATTACTGC TGAACGCGAA AAACGCGCCC
551 GTATCGCCGA ATCCGAAGGT CGTAAAATCG AACAAATCAA CCTTGCCAGT
601 GGTCAGCGCG AAGCCGAAAT CCAACAATCC GAAGGCGAGG CTCAGGCTGC
651 GGTCAATGCG TCAAATGCCG AGAAAATCGC CCGCATCAAC CGCGCCAAAG
701 GTGAAGCGGA ATCCTTGCGC CTTGTTGCCG AAGCCAATGC CGAAGCCATC
751 CGTCAAATTG CCGCCGCCCT TCAAACCCAA GGCGGTGCGG ATGCGGTCAA
801 TCTGAAGATT GCGGAACAAT ACGTCGCCGC GTTCAACAAT CTTGCCAAAG
851 AAAGCAATAC GCTGATTATG CCCGCCAATG TTGCCGACAT CGGCAGCCTG
901 ATTTCTGCCG GTATGAAAAT TATCGACAGC AGCAAAACCG CCAAATAA

This corresponds to the amino acid sequence <SEQ ID 979; ORF 519-1.a>:
a519-1.pep
  1 MEFFIILLAA VVVFGFKSFV VIPQQEVHVV ERLGRFHRAL TAGLNILIPF
 51 IDRVAYRHSL KEIPLDVPSQ VCITRDNTQL TVDGIIYFQV TDPKLASYGS
101 SNYIMAITQL AQTTLRSVIG RMELDKTFEE RDEINSTVVS ALDEAAGAWG
151 VKVLRYEIKD LVPPQEILRS MQAQITAERE KRARIAESEG RKIEQINLAS
201 GQREAEIQQS EGEAQAAVNA SNAEKIARIN RAKGEAESLR LVAEANAEAI
251 RQIAAALQTQ GGADAVNLKI AEQYVAAFNN LAKESNTLIM PANVADIGSL
301 ISAGMKIIDS SKTAK*

m519-1/a519-1 ORFs 519-1 and 519-1.a showed a 99.0% identity in 315 aa overlap

                    10        20        30        40        50        60
a519-1.pep  MEFFIILLAAVVVFGFKSFVVIPQQEVHVVERLGRFHRALTAGLNILIPFIDRVAYRHSL
            |||||||| || ||||||||||||||||||||||||||||||||||||||||||||||||
m519-1      MEFFIILLVAVAVFGFKSFVVIPQQEVHVVERLGRFHRALTAGLNILIPFIDRVAYRHSL
                    10        20        30        40        50        60

                    70        80        90       100       110       120
a519-1.pep  KEIPLDVPSQVCITRDNTQLTVDGIIYFQVTDPKLASYGSSNYIMAITQLAQTTLRSVIG
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
m519-1      KEIPLDVPSQVCITRDNTQLTVDGIIYFQVTDPKLASYGSSNYIMAITQLAQTTLRSVIG
                    70        80        90       100       110       120

                   130       140       150       160       170       180
a519-1.pep  RMELDKTFEERDEINSTVVSALDEAAGAWGVKVLRYEIKDLVPPQEILRSMQAQITAERE
            ||||||||||||||||||| ||||||||||||||||||||||||||||||||||||||||
m519-1      RMELDKTFEERDEINSTVVAALDEAAGAWGVKVLRYEIKDLVPPQEILRSMQAQITAERE
                   130       140       150       160       170       180

                   190       200       210       220       230       240
a519-1.pep  KRARIAESEGRKIEQINLASGQREAEIQQSEGEAQAAVNASNAEKIARINRAKGEAESLR
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
m519-1      KRARIAESEGRKIEQINLASGQREAEIQQSEGEAQAAVNASNAEKIARINRAKGEAESLR
                   190       200       210       220       230       240

                   250       260       270       280       290       300
a519-1.pep  LVAEANAEAIRQIAAALQTQGGADAVNLKIAEQYVAAFNNLAKESNTLIMPANVADIGSL
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
m519-1      LVAEANAEAIRQIAAALQTQGGADAVNLKIAEQYVAAFNNLAKESNTLIMPANVADIGSL
                   250       260       270       280       290       300

                   310
a519-1.pep  ISAGMKIIDSSKTAKX
            ||||||||||||||||
m519-1      ISAGMKIIDSSKTAKX
                   310

576 and 576-1 gnm22.seq

The following partial DNA sequence was identified in N. meningitidis <SEQ ID 980>:
m576.seq.. (partial)
  1 ..ATGCAGCAGG CAAGCTATGC GATGGGCGTG GACATCGGAC GCTCCCTGAA
 51 GCAAATGAAG GAACAGGGCG CGGAAATCGA TTTGAAAGTC TTTACCGAAG
101 CCATGCAGGC AGTGTATGAC GGCAAAGAAA TCAAAATGAC CGAAGAGCAG
151 GCTCAGGAAG TCATGATGAA ATTCCTTCAG GAACAACAGG CTAAAGCCGT
201 AGAAAAACAC AAGGCGGACG CGAAGGCCAA TAAAGAAAAA GGCGAAGCCT
251 TTCTGAAAGA AAATGCCGCC AAAGACGGCG TGAAGACCAC TGCTTCCGGC
301 CTGCAATACA AAATCACCAA ACAGGGCGAA GGCAAACAGC CGACCAAAGA
351 CGACATCGTT ACCGTGGAAT ACGAAGGCCG CCTGATTGAC GGTACGGTAT
401 TCGACAGCAG CAAAGCCAAC GGCGGCCCGG TCACCTTCCC TTTGAGCCAA
451 GTGATTCCGG GTTGGACCGA AGgCGTACAG CTTCTGAAAG AAGGCGGCGA
501 AGCCACGTTC TACATCCCGT CCAACCTTGC CTACCGCGAA CAGGGTGCGG
551 GCGACAAAAT CGGTCCGAAC GCCACTTTGG TATTTGATGT GAAACTGGTC
601 AAAATCGGCG CACCCGAAAA CGCGCCCGCC AAGCAGCCGG CTCAAGTCGA
651 CATCAAAAAA GTAAATTAA

This corresponds to the amino acid sequence <SEQ ID 981; ORF 576>: 

m576.pep.. (partial) 

1 . .MQQASYAMGV DIGRSLKQMK EQGAEIDLKV FTEAMQAVYD GKEIKMTEEQ 
51 AQEVMMKFLQ EQQAKAVEKH KADAKANKEK GEAFLKENAA KDGVKTTASG 
101 LQYKITKQGE GKQPTKDDIV TVEYEGRLID GTVFDSSKAN GGPVTFPLSQ 
151 VIPGWTEGVQ LLKEGGEATF YIPSNLAYRE QGAGDKIGPN ATLVFDVKLV 
201 KIGAPENAPA KQPAQVDIKK VN* 

The following partial DNA sequence was identified in N. gonorrhoeae <SEQ ID 982>: 



g576.seq (partial) 

1 ..atgggcgtgg acatcggacg ctccctgaaa caaatgaagg aacagggcgc 
51 ggaaatcgat ttgaaagtct ttaccgatgc catgcaggca gtgtatgacg 
101 gcaaagaaat caaaatgacc gaagagcagg cccaggaagt gatgatgaaa 
151 ttcctgcagg agcagcaggc taaagccgta gaaaaacaca aggcggatgc 
201 gaaggccaac aaagaaaaag gcgaagcctt cctgaaggaa aatgccgccg 
251 aagacggcgt [missing] gcttccggtc tgcagtacaa aatcaccaaa 
301 cagggtgaag gcaaacagcc gacaaaagac gacatcgtta ccgtggaata 
351 cgaaggccgc ctgattgacg gtaccgtatt cgacagcagc aaagccaacg 
401 gcggcccggc caccttccct ttgagccaag tgattccggg ttggaccgaa 
451 ggcgtacggc ttctgaaaga aggcggcgaa gccacgttct acatcccgtc 
501 caaccttgcc taccgcgaac agggtgcggg cgaaaaaatc ggtccgaacg 
551 ccactttggt atttgacgtg aaactggtca aaatcggcgc acccgaaaac 
601 gcgcccgcca agcagccgga tcaagtcgac atcaaaaaag taaattaa 



This corresponds to the amino acid sequence <SEQ ID 983; ORF 576.ng>: 

g576.pep. . (partial) 

1 ..MGVDIGRSLK QMKEQGAEID LKVFTDAMQA VYDGKEIKMT EEQAQEVMMK 
51 FLQEQQAKAV EKHKADAKAN KEKGEAFLKE NAAEDGVKTT ASGLQYKITK 
101 QGEGKQPTKD DIVTVEYEGR LIDGTVFDSS KANGGPATFP LSQVIPGWTE 
151 GVRLLKEGGE ATFYIPSNLA YREQGAGEKI GPNATLVFDV KLVKIGAPEN 
201 APAKQPDQVD IKKVN* 



Computer analysis of this amino acid sequence gave the following results: 
Homology with a predicted ORF from N. gonorrhoeae 



m576/g576 97.2% identity in 215 aa overlap 

MQQASYAMGVDIGRSLKQMKEQGAEIDLKVFTEAMQAVYDGKEIKMTEEQAQEVMMKFLQ 

MGVDIGRSLKQMKEQGAEIDLKVFTDAMQAVYDGKEIKMTEEQAQEVMMKFLQ 
10 20 30 40 50 

70 80 90 100 110 120 

EQQAKAVEKHKADAKANKEKGEAFLKENAAKDGVKTTASGLQYKITKQGEGKQPTKDDIV 

I I Illllllhllllllllllllllllll Mill 

EQQAKAVEKHKADAKANKEKGEAFLKENAAEDGVKTTASGLQYKITKQGEGKQPTKDDIV 
60 70 80 90 100 110 



QGAGDKIGPNATLVFDVKLVKIGAPENAPAKQPAQVDIKKVNX 



The following partial DNA sequence was identified in N. meningitidis <SEQ ID 984>: 

a576.seq 

1 ATGAACACCA TTTTCAAAAT CAGCGCACTG ACCCTTTCCG CCGCTTTGGC 

51 ACTTTCCGCC TGCGGCAAAA AAGAAGCCGC CCCCGCATCT GCATCCGAAC 

101 CTGCCGCCGC TTCTTCCGCG CAGGGCGACA CCTCTTCGAT CGGCAGCACG 

151 ATGCAGCAGG CAAGCTATGC GATGGGCGTG GACATCGGAC GCTCCCTGAA 

201 GCAAATGAAG GAACAGGGCG CGGAAATCGA TTTGAAAGTC TTTACCGAAG 

251 CCATGCAGGC AGTGTATGAC GGCAAAGAAA TCAAAATGAC CGAAGAGCAG 

301 GCTCAGGAAG TCATGATGAA ATTCCTTCAG GAACAACAGG CTAAAGCCGT 

351 AGAAAAACAC AAGGCGGACG CGAAGGCCAA TAAAGAAAAA GGCGAAGCCT 

401 TTCTGAAAGA AAATGCCGCC AAAGACGGCG TGAAGACCAC TGCTTCCGGC 

451 CTGCAATACA AAATCACCAA ACAGGGCGAA GGCAAACAGC CGACCAAAGA 

501 CGACATCGTT ACCGTGGAAT ACGAAGGCCG CCTGATTGAC GGTACGGTAT 

551 TCGACAGCAG CAAAGCCAAC GGCGGCCCGG TCACCTTCCC TTTGAGCCAA 

601 GTGATTCTGG GTTGGACCGA AGGCGTACAG CTTCTGAAAG AAGGCGGCGA 

651 AGCCACGTTC TACATCCCGT CCAACCTTGC CTACCGCGAA CAGGGTGCGG 

701 GCGACAAAAT CGGCCCGAAC GCCACTTTGG TATTTGATGT GAAACTGGTC 
751 AAAATCGGCG CACCCGAAAA CGCGCCCGCC AAGCAGCCGG CTCAAGTCGA 

801 CATCAAAAAA GTAAATTAA 

This corresponds to the amino acid sequence <SEQ ID 985; ORF 576.a>: 

a576.pep 

1 MNTIFKISAL TLSAALALSA CGKKEAAPAS ASEPAAASSA QGDTSSIGST 

51 MQQASYAMGV DIGRSLKQMK EQGAEIDLKV FTEAMQAVYD GKEIKMTEEQ 

101 AQEVMMKFLQ EQQAKAVEKH KADAKANKEK GEAFLKENAA KDGVKTTASG 

151 LQYKITKQGE GKQPTKDDIV TVEYEGRLID GTVFDSSKAN GGPVTFPLSQ 

201 VILGWTEGVQ LLKEGGEATF YIPSNLAYRE QGAGDKIGPN ATLVFDVKLV 

251 KIGAPENAPA KQPAQVDIKK VN* 

m576/a576 ORFs 576 and 576.a showed a 99.5% identity in 222 aa overlap 

10 20 30 

m576.pep MQQASYAMGVDIGRSLKQMKEQGAEIDLKV 

I I I I II I I II I II II II I I 

a576 CGKKEAAPASASEPAAASSAQGDTSSIGSTMQQASYAMGVDIGRSLKQMKEQGAEIDLKV 

m576.pep FTEAMQAVYDGKEIKMTEEQAQEVMMKFLQEQQAKAVEKHKADAKANKEKGEAFLKENAA 
a576 FTEAMQAVYDGKEIKMTEEQAQEVMMKFLQEQQAKAVEKHKADAKANKEKGEAFLKENAA 



KDGVKTTASGLQYKITKQGEGKQPTKDDIVTVEYEGRLIDGTVFDSSKANGGPVTFPLSQ 
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I { I I I I I I I 



160 170 180 190 200 210 

VIPGWTEGVQLLKEGGEATFYIPSNLAYREQGAGDKIGPNATLVFDVKLVKIGAPENAPA 
II I II I I I I II II I I I I I M I I I I I I I I I I I I II II I I I II I II I I I I I I I I I ! I I I I I 
VILGWTEGVQLLKEGGEATFYIPSNLAYREQGAGDKIGPNATLVFDVKLVKIGAPENAPA 

210 220 230 240 250 260 

220 

KQPAQVDIKKVNX 



Further work revealed the following DNA sequence identified in N. meningitidis <SEQ ID 
986>: 

m576-1.seq 

1 ATGAACACCA TTTTCAAAAT CAGCGCACTG ACCCTTTCCG CCGCTTTGGC 

51 ACTTTCCGCC TGCGGCAAAA AAGAAGCCGC CCCCGCATCT GCATCCGAAC 

101 CTGCCGCCGC TTCTTCCGCG CAGGGCGACA CCTCTTCGAT CGGCAGCACG 

151 ATGCAGCAGG CAAGCTATGC GATGGGCGTG GACATCGGAC GCTCCCTGAA 

201 GCAAATGAAG GAACAGGGCG CGGAAATCGA TTTGAAAGTC TTTACCGAAG 

251 CCATGCAGGC AGTGTATGAC GGCAAAGAAA TCAAAATGAC CGAAGAGCAG 

301 GCTCAGGAAG TCATGATGAA ATTCCTTCAG GAACAACAGG CTAAAGCCGT 

351 AGAAAAACAC AAGGCGGACG CGAAGGCCAA TAAAGAAAAA GGCGAAGCCT 

401 TTCTGAAAGA AAATGCCGCC AAAGACGGCG TGAAGACCAC TGCTTCCGGC 

451 CTGCAATACA AAATCACCAA ACAGGGCGAA GGCAAACAGC CGACCAAAGA 

501 CGACATCGTT ACCGTGGAAT ACGAAGGCCG CCTGATTGAC GGTACGGTAT 

551 TCGACAGCAG CAAAGCCAAC GGCGGCCCGG TCACCTTCCC TTTGAGCCAA 

601 GTGATTCCGG GTTGGACCGA AGGCGTACAG CTTCTGAAAG AAGGCGGCGA 

651 AGCCACGTTC TACATCCCGT CCAACCTTGC CTACCGCGAA CAGGGTGCGG 

701 GCGACAAAAT CGGTCCGAAC GCCACTTTGG TATTTGATGT GAAACTGGTC 

751 AAAATCGGCG CACCCGAAAA CGCGCCCGCC AAGCAGCCGG CTCAAGTCGA 

801 CATCAAAAAA GTAAATTAA 
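
Because the listings are printed as numbered rows of 10-base blocks, a mislabelled or dropped row is easy to miss by eye. The following sketch, given only as an illustration, joins such rows into one string and checks each row number against the running position; the function name `parse_listing` and its error message are hypothetical.

```python
def parse_listing(text: str) -> str:
    """Join a numbered, block-formatted listing into one sequence string.

    Each non-empty row is expected to look like
    '701 GCGACAAAAT CGGTCCGAAC ...'. The leading number must equal the
    position of the first residue in that row, counted from the first row.
    """
    sequence = ""
    first = None
    for row in text.splitlines():
        fields = row.split()
        if not fields:
            continue                      # skip blank separator lines
        number, blocks = int(fields[0]), fields[1:]
        if first is None:
            first = number                # listing excerpts need not start at 1
        expected = first + len(sequence)
        if number != expected:
            raise ValueError(f"row labelled {number}, expected {expected}")
        sequence += "".join(blocks)
    return sequence


# Last three rows of the m576-1.seq listing above.
TAIL = """
701 GCGACAAAAT CGGTCCGAAC GCCACTTTGG TATTTGATGT GAAACTGGTC

751 AAAATCGGCG CACCCGAAAA CGCGCCCGCC AAGCAGCCGG CTCAAGTCGA

801 CATCAAAAAA GTAAATTAA
"""
tail = parse_listing(TAIL)
print(len(tail), tail[-9:])  # 119 GTAAATTAA
```

A row whose label does not match the accumulated length raises `ValueError`, which flags both mis-numbered rows and rows with a missing block.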

This corresponds to the amino acid sequence <SEQ ID 987; ORF 576-1>: 

m576-1.pep 

1 MNTIFKISAL TLSAALALSA CGKKEAAPAS ASEPAAASSA QGDTSSIGST 

51 MQQASYAMGV DIGRSLKQMK EQGAEIDLKV FTEAMQAVYD GKEIKMTEEQ 

101 AQEVMMKFLQ EQQAKAVEKH KADAKANKEK GEAFLKENAA KDGVKTTASG 

151 LQYKITKQGE GKQPTKDDIV TVEYEGRLID GTVFDSSKAN GGPVTFPLSQ 

201 VIPGWTEGVQ LLKEGGEATF YIPSNLAYRE QGAGDKIGPN ATLVFDVKLV 

251 KIGAPENAPA KQPAQVDIKK VN* 

The following DNA sequence was identified in N. gonorrhoeae <SEQ ID 988>: 

g576-1.seq 

1 ATGAACACCA TTTTCAAAAT CAGCGCACTG ACCCTTTCCG CCGCTTTGGC 
51 ACTTTCCGCC TGCGGCAAAA AAGAAGCCGC CCCCGCATCT GCATCCGAAC 
101 CTGCCGCCGC TTCTGCCGCG CAGGGCGACA CCTCTTCAAT CGGCAGCACG 
151 ATGCAGCAGG CAAGCTATGC AATGGGCGTG GACATCGGAC GCTCCCTGAA 
201 ACAAATGAAG GAACAGGGCG CGGAAATCGA TTTGAAAGTC TTTACCGATG 
251 CCATGCAGGC AGTGTATGAC GGCAAAGAAA TCAAAATGAC CGAAGAGCAG 
301 GCCCAGGAAG TGATGATGAA ATTCCTGCAG GAGCAGCAGG CTAAAGCCGT 
351 AGAAAAACAC AAGGCGGATG CGAAGGCCAA CAAAGAAAAA GGCGAAGCCT 
401 TCCTGAAGGA AAATGCCGCC AAAGACGGCG TGAAGACCAC TGCTTCCGGT 
451 CTGCAGTACA AAATCACCAA ACAGGGTGAA GGCAAACAGC CGACAAAAGA 
501 CGACATCGTT ACCGTGGAAT ACGAAGGCCG CCTGATTGAC GGTACCGTAT 
551 TCGACAGCAG CAAAGCCAAC GGCGGCCCGG CCACCTTCCC TTTGAGCCAA 
601 GTGATTCCGG GTTGGACCGA AGGCGTACGG CTTCTGAAAG AAGGCGGCGA 
651 AGCCACGTTC TACATCCCGT CCAACCTTGC CTACCGCGAA CAGGGTGCGG 
701 GCGAAAAAAT CGGTCCGAAC GCCACTTTGG TATTTGACGT GAAACTGGTC 
751 AAAATCGGCG CACCCGAAAA CGCGCCCGCC AAGCAGCCGG ATCAAGTCGA 
801 CATCAAAAAA GTAAATTAA 

This corresponds to the amino acid sequence <SEQ ID 989; ORF 576-1.ng>: 

g576-1.pep 

1 MNTIFKISAL TLSAALALSA CGKKEAAPAS ASEPAAASAA QGDTSSIGST 

51 MQQASYAMGV DIGRSLKQMK EQGAEIDLKV FTDAMQAVYD GKEIKMTEEQ 

101 AQEVMMKFLQ EQQAKAVEKH KADAKANKEK GEAFLKENAA KDGVKTTASG 

151 LQYKITKQGE GKQPTKDDIV TVEYEGRLID GTVFDSSKAN GGPATFPLSQ 

201 VIPGWTEGVR LLKEGGEATF YIPSNLAYRE QGAGEKIGPN ATLVFDVKLV 

251 KIGAPENAPA KQPDQVDIKK VN* 

g576-1/m576-1 ORFs 576-1 and 576-1.ng showed a 97.8% identity in 272 aa overlap 

10 20 30 40 50 60 

g576-1.pep MNTIFKISALTLSAALALSACGKKEAAPASASEPAAASAAQGDTSSIGSTMQQASYAMGV 
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I i I I I I : I I I I I I I I I I I I I I I I I I I I I 
m576-1 MNTIFKISALTLSAALALSACGKKEAAPASASEPAAASSAQGDTSSIGSTMQQASYAMGV 

10 20 30 40 50 60 

70 80 90 100 110 120 

g576-1.pep DIGRSLKQMKEQGAEIDLKVFTDAMQAVYDGKEIKMTEEQAQEVMMKFLQEQQAKAVEKH 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

m576-1 DIGRSLKQMKEQGAEIDLKVFTEAMQAVYDGKEIKMTEEQAQEVMMKFLQEQQAKAVEKH 
70 80 90 100 110 120 

130 140 150 160 170 180 

g576-1.pep KADAKANKEKGEAFLKENAAKDGVKTTASGLQYKITKQGEGKQPTKDDIVTVEYEGRLID 

m576-1 KADAKANKEKGEAFLKENAAKDGVKTTASGLQYKITKQGEGKQPTKDDIVTVEYEGRLID 
130 140 150 160 170 180 

190 200 210 220 230 240 

g576-1.pep GTVFDSSKANGGPATFPLSQVIPGWTEGVRLLKEGGEATFYIPSNLAYREQGAGEKIGPN 

m576-1 GTVFDSSKANGGPVTFPLSQVIPGWTEGVQLLKEGGEATFYIPSNLAYREQGAGDKIGPN 
190 200 210 220 230 240 

250 260 270 

g576-1.pep ATLVFDVKLVKIGAPENAPAKQPDQVDIKKVNX 
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
m576-1 ATLVFDVKLVKIGAPENAPAKQPAQVDIKKVNX 
250 260 270 

The following DNA sequence was identified in N. meningitidis <SEQ ID 990>: 

a576-1.seq 

1 ATGAACACCA TTTTCAAAAT CAGCGCACTG ACCCTTTCCG CCGCTTTGGC 
51 ACTTTCCGCC TGCGGCAAAA AAGAAGCCGC CCCCGCATCT GCATCCGAAC 
101 CTGCCGCCGC TTCTTCCGCG CAGGGCGACA CCTCTTCGAT CGGCAGCACG 
151 ATGCAGCAGG CAAGCTATGC GATGGGCGTG GACATCGGAC GCTCCCTGAA 

201 GCAAATGAAG GAACAGGGCG CGGAAATCGA TTTGAAAGTC TTTACCGAAG 

251 CCATGCAGGC AGTGTATGAC GGCAAAGAAA TCAAAATGAC CGAAGAGCAG 

301 GCTCAGGAAG TCATGATGAA ATTCCTTCAG GAACAACAGG CTAAAGCCGT 

351 AGAAAAACAC AAGGCGGACG CGAAGGCCAA TAAAGAAAAA GGCGAAGCCT 

401 TTCTGAAAGA AAATGCCGCC AAAGACGGCG TGAAGACCAC TGCTTCCGGC 

451 CTGCAATACA AAATCACCAA ACAGGGCGAA GGCAAACAGC CGACCAAAGA 

501 CGACATCGTT ACCGTGGAAT ACGAAGGCCG CCTGATTGAC GGTACGGTAT 

551 TCGACAGCAG CAAAGCCAAC GGCGGCCCGG TCACCTTCCC TTTGAGCCAA 

601 GTGATTCTGG GTTGGACCGA AGGCGTACAG CTTCTGAAAG AAGGCGGCGA 

651 AGCCACGTTC TACATCCCGT CCAACCTTGC CTACCGCGAA CAGGGTGCGG 

701 GCGACAAAAT CGGCCCGAAC GCCACTTTGG TATTTGATGT GAAACTGGTC 

751 AAAATCGGCG CACCCGAAAA CGCGCCCGCC AAGCAGCCGG CTCAAGTCGA 

801 CATCAAAAAA GTAAATTAA 

This corresponds to the amino acid sequence <SEQ ID 991; ORF 576-1.a>: 

a576-1.pep 

1 MNTIFKISAL TLSAALALSA CGKKEAAPAS ASEPAAASSA QGDTSSIGST 

51 MQQASYAMGV DIGRSLKQMK EQGAEIDLKV FTEAMQAVYD GKEIKMTEEQ 

101 AQEVMMKFLQ EQQAKAVEKH KADAKANKEK GEAFLKENAA KDGVKTTASG 

151 LQYKITKQGE GKQPTKDDIV TVEYEGRLID GTVFDSSKAN GGPVTFPLSQ 

201 VILGWTEGVQ LLKEGGEATF YIPSNLAYRE QGAGDKIGPN ATLVFDVKLV 

251 KIGAPENAPA KQPAQVDIKK VN* 

a576-1/m576-1 ORFs 576-1 and 576-1.a showed a 99.6% identity in 272 aa overlap 

10 20 30 40 50 60 

a576-1.pep MNTIFKISALTLSAALALSACGKKEAAPASASEPAAASSAQGDTSSIGSTMQQASYAMGV 

m576-1 MNTIFKISALTLSAALALSACGKKEAAPASASEPAAASSAQGDTSSIGSTMQQASYAMGV 
10 20 30 40 50 60 

70 80 90 100 110 120 

a576-1.pep DIGRSLKQMKEQGAEIDLKVFTEAMQAVYDGKEIKMTEEQAQEVMMKFLQEQQAKAVEKH 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I 
m576-1 DIGRSLKQMKEQGAEIDLKVFTEAMQAVYDGKEIKMTEEQAQEVMMKFLQEQQAKAVEKH 
70 80 90 100 110 120 

130 140 150 160 170 180 

a576-1.pep KADAKANKEKGEAFLKENAAKDGVKTTASGLQYKITKQGEGKQPTKDDIVTVEYEGRLID 

I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

m576-1 KADAKANKEKGEAFLKENAAKDGVKTTASGLQYKITKQGEGKQPTKDDIVTVEYEGRLID 

130 140 150 160 170 180 

190 200 210 220 230 240 

a576-1.pep GTVFDSSKANGGPVTFPLSQVILGWTEGVQLLKEGGEATFYIPSNLAYREQGAGDKIGPN 

m576-1 GTVFDSSKANGGPVTFPLSQVIPGWTEGVQLLKEGGEATFYIPSNLAYREQGAGDKIGPN 
190 200 210 220 230 240 

250 260 270 

a576-1.pep ATLVFDVKLVKIGAPENAPAKQPAQVDIKKVNX 

m576-1 ATLVFDVKLVKIGAPENAPAKQPAQVDIKKVNX 
250 260 270 



919 and 919-2 



gnm43.seq 
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 992>: 

m919.seq 

1 ATGAAAAAAT ACCTATTCCG CGCCGCCCTG TACGGCATCG CCGCCGCCAT 
51 CCTCGCCGCC TGCCAAAGCA AGAGCATCCA AACCTTTCCG CAACCCGACA 
101 CATCCGTCAT CAACGGCCCG GACCGGCCGG TCGGCATCCC CGACCCCGCC 
151 GGAACGACGG TCGGCGGCGG CGGGGCCGTC TATACCGTTG TACCGCACCT 
201 GTCCCTGCCC CACTGGGCGG CGCAGGATTT CGCCAAAAGC CTGCAATCCT 
251 TCCGCCTCGG CTGCGCCAAT TTGAAAAACC GCCAAGGCTG GCAGGATGTG 
301 TGCGCCCAAG CCTTTCAAAC CCCCGTCCAT TCCTTTCAGG CAAAACAGTT 
351 TTTTGAACGC TATTTCACGC CGTGGCAGGT TGCAGGCAAC GGAAGCCTTG 
401 CCGGTACGGT TACCGGCTAT TACGAACCGG TGCTGAAGGG CGACGACAGG 
451 CGGACGGCAC AAGCCCGCTT CCCGATTTAC GGTATTCCCG ACGATTTTAT 
501 CTCCGTCCCC CTGCCTGCCG GTTTGCGGAG CGGAAAAGCC CTTGTCCGCA 
551 TCAGGCAGAC GGGAAAAAAC AGCGGCACAA TCGACAATAC CGGCGGCACA 
601 CATACCGCCG ACCTCTCCcG ATTCCCCATC ACCGCGCGCA CAACAGCAAT 
651 CAAAGGCAGG TTTGAAGGAA GCCGCTTCCT CCCCTACCAC ACGCGCAACC 
701 AAATCAACGG CGGCGCGCTT GACGGCAAAG CCCCGATACT CGGTTACGCC 
751 GAAGACCCTG TCGAACTTTT TTTTATGCAC ATCCAAGGCT CGGGCCGTCT 
801 GAAAACCCCG TCCGGCAAAT ACATCCGCAT CGGCTATGCC GACAAAAACG 
851 AACATCCCTA CGTTTCCATC GGACGCTATA TGGCGGATAA GGGCTACCTC 
901 AAACTCGGAC AAACCTCCAT GCAGGGCATT AAGTCTTATA TGCGGCAAAA 
951 TCCGCAACGC CTCGCCGAAG TTTTGGGTCA AAACCCCAGC TATATCTTTT 
1001 TCCGCGAGCT TGCCGGAAGC AGCAATGACG GCCCTGTCGG CGCACTGGGC 
1051 ACGCCGCTGA TGGGGGAATA TGCCGGCGCA GTCGACCGGC ACTACATTAC 
1101 CTTGGGTGCG CCCTTATTTG TCGCCACCGC CCATCCGGTT ACCCGCAAAG 
1151 CCCTCAACCG CCTGATTATG GCGCAGGATA CCGGCAGCGC GATTAAAGGC 
1201 GCGGTGCGCG TGGATTATTT TTGGGGATAC GGCGACGAAG CCGGCGAACT 
1251 TGCCGGCAAA CAGAAAACCA CGGGATATGT CTGGCAGCTC CTACCCAACG 
1301 GTATGAAGCC CGAATACCGc CCGTAA 







This corresponds to the amino acid sequence <SEQ ID 993; ORF 919>: 

m919.pep 

1 MKKYLFRAAL YGIAAAILAA CQSKSIQTFP QPDTSVINGP DRPVGIPDPA 

51 GTTVGGGGAV YTVVPHLSLP HWAAQDFAKS LQSFRLGCAN LKNRQGWQDV 

101 CAQAFQTPVH SFQAKQFFER YFTPWQVAGN GSLAGTVTGY YEPVLKGDDR 

151 RTAQARFPIY GIPDDFISVP LPAGLRSGKA LVRIRQTGKN SGTIDNTGGT 

201 HTADLSRFPI TARTTAIKGR FEGSRFLPYH TRNQINGGAL DGKAPILGYA 

251 EDPVELFFMH IQGSGRLKTP SGKYIRIGYA DKNEHPYVSI GRYMADKGYL 

301 KLGQTSMQGI KSYMRQNPQR LAEVLGQNPS YIFFRELAGS SNDGPVGALG 

351 TPLMGEYAGA VDRHYITLGA PLFVATAHPV TRKALNRLIM AQDTGSAIKG 

401 AVRVDYFWGY GDEAGELAGK QKTTGYVWQL LPNGMKPEYR P* 



The following partial DNA sequence was identified in N. meningitidis <SEQ ID 994>: 

m919-2.seq 

1 ATGAAAAAAT ACCTATTCCG CGCCGCCCTG TACGGCATCG CCGCCGCCAT 

51 CCTCGCCGCC TGCCAAAGCA AGAGCATCCA AACCTTTCCG CAACCCGACA 

101 CATCCGTCAT CAACGGCCCG GACCGGCCGG TCGGCATCCC CGACCCCGCC 

151 GGAACGACGG TCGGCGGCGG CGGGGCCGTC TATACCGTTG TACCGCACCT 

201 GTCCCTGCCC CACTGGGCGG CGCAGGATTT CGCCAAAAGC CTGCAATCCT 

251 TCCGCCTCGG CTGCGCCAAT TTGAAAAACC GCCAAGGCTG GCAGGATGTG 

301 TGCGCCCAAG CCTTTCAAAC CCCCGTCCAT TCCTTTCAGG CAAAACAGTT 

351 TTTTGAACGC TATTTCACGC CGTGGCAGGT TGCAGGCAAC GGAAGCCTTG 

401 CCGGTACGGT TACCGGCTAT TACGAACCGG TGCTGAAGGG CGACGACAGG 

451 CGGACGGCAC AAGCCCGCTT CCCGATTTAC GGTATTCCCG ACGATTTTAT 

501 CTCCGTCCCC CTGCCTGCCG GTTTGCGGAG CGGAAAAGCC CTTGTCCGCA 

551 TCAGGCAGAC GGGAAAAAAC AGCGGCACAA TCGACAATAC CGGCGGCACA 
601 CATACCGCCG ACCTCTCCCG ATTCCCCATC ACCGCGCGCA CAACAGCAAT 

651 CAAAGGCAGG TTTGAAGGAA GCCGCTTCCT CCCCTACCAC ACGCGCAACC 

701 AAATCAACGG CGGCGCGCTT GACGGCAAAG CCCCGATACT CGGTTACGCC 

751 GAAGACCCTG TCGAACTTTT TTTTATGCAC ATCCAAGGCT CGGGCCGTCT 

801 GAAAACCCCG TCCGGCAAAT ACATCCGCAT CGGCTATGCC GACAAAAACG 

851 AACATCCCTA CGTTTCCATC GGACGCTATA TGGCGGATAA GGGCTACCTC 

901 AAACTCGGAC AAACCTCCAT GCAGGGCATT AAGTCTTATA TGCGGCAAAA 

951 TCCGCAACGC CTCGCCGAAG TTTTGGGTCA AAACCCCAGC TATATCTTTT 

1001 TCCGCGAGCT TGCCGGAAGC AGCAATGACG GCCCTGTCGG CGCACTGGGC 

1051 ACGCCGCTGA TGGGGGAATA TGCCGGCGCA GTCGACCGGC ACTACATTAC 

1101 CTTGGGTGCG CCCTTATTTG TCGCCACCGC CCATCCGGTT ACCCGCAAAG 

1151 CCCTCAACCG CCTGATTATG GCGCAGGATA CCGGCAGCGC GATTAAAGGC 

1201 GCGGTGCGCG TGGATTATTT TTGGGGATAC GGCGACGAAG CCGGCGAACT 

1251 TGCCGGCAAA CAGAAAACCA CGGGATATGT CTGGCAGCTC CTACCCAACG 

1301 GTATGAAGCC CGAATACCGC GCGTAA 

This corresponds to the amino acid sequence <SEQ ID 995; ORF 919-2>: 

m919-2.pep 

1 MKKYLFRAAL YGIAAAILAA CQSKSIQTFP QPDTSVINGP DRPVGIPDPA 

51 GTTVGGGGAV YTVVPHLSLP HWAAQDFAKS LQSFRLGCAN LKNRQGWQDV 

101 CAQAFQTPVH SFQAKQFFER YFTPWQVAGN GSLAGTVTGY YEPVLKGDDR 

151 RTAQARFPIY GIPDDFISVP LPAGLRSGKA LVRIRQTGKN SGTIDNTGGT 

201 HTADLSRFPI TARTTAIKGR FEGSRFLPYH TRNQINGGAL DGKAPILGYA 

251 EDPVELFFMH IQGSGRLKTP SGKYIRIGYA DKNEHPYVSI GRYMADKGYL 

301 KLGQTSMQGI KSYMRQNPQR LAEVLGQNPS YIFFRELAGS SNDGPVGALG 

351 TPLMGEYAGA VDRHYITLGA PLFVATAHPV TRKALNRLIM AQDTGSAIKG 

401 AVRVDYFWGY GDEAGELAGK QKTTGYVWQL LPNGMKPEYR P* 



The following partial DNA sequence was identified in N. gonorrhoeae <SEQ ID 996>: 

g919.seq 

1 ATGAAAAAAC ACCTGCTCCG CTCCGCCCTG TACGGcatCG CCGCCgccAT 
51 CctcgCCGCC TGCCAAAgca gGAGCATCCA AACCTTTCCG CAACCCGACA 
101 CATCCGTCAT CAACGGCCCG GACCGGCCGG CCGGCATCCC CGACCCCGCC 
151 GGAACGACGG TTGCCGGCGG CGGGGCCGTC TATACCGTTG TGCCGCACCT 
201 GTCCATGCCC CACTGGGCSG CGCaggATTT TGCCAAAAGC CTGCAATCCT 
251 TCCGCCTCGG CTGCGCCAAT TTGAAAAACC GCCAAGGCTG GCAGGATGTG 
301 TGCGCCCAAG CCTTTCAAAC CCCCGTGCAT TCCTTTCAGG CAAAGcGgTT 
351 TTTTGAACGC TATTTCACGC cgtGGCaggt tgcaggcaAC GGAAGcCTTG 
401 Caggtacggt TACCGGCTAT TACGAACCGG TGCTGAAGGG CGACGGCAGG 
451 CGGACGGAAC GGGCCCGCTT CCCGATTTAC GGTATTCCCG ACGATTTTAT 
501 CTCCGTCCCG CTGCCTGCCG GTTTGCGGGG CGGAAAAAAC CTTGTCCGCA 
551 TCAGGCAGac ggGGAAAAAC AGCGGCACGA TCGACAATGC CGGCGGCACG 
601 CATACCGCCG ACCTCTCCCG ATTCCCCATC ACCGCGCGCA CAACGGcaat 
651 caaaGGCAGG TTTGAaggAA GCCGCTTCCT CCCTTACCAC ACGCGCAACC 
701 AAAtcaacGG CGGCgcgcTT GACGGCAAag cccCCATCCT CggttacgcC 
751 GAagaccCcG tcgaacttTT TTTCATGCAC AtccaaggCT CGGGCCGCCT 
801 GAAAACCCcg tccggcaaat acatCCGCAt cggaTacgcc gacAAAAACG 
851 AACAtccgTa tgtttccatc ggACGctaTA TGGCGGACAA AGGCTACCTC 
901 AAGctcgggc agACCTCGAT GCAGGgcatc aaagcCTATA TGCGGCAAAA 
951 TCCGCAACGC CTCGCCGAAG TTTTGGGTCA AAACCCCAGC TATATCTTTT 
1001 TCCGCGAGCT TGCCGGAAGC GGCAATGAGG GCCCCGTCGG CGCACTGGGC 
1051 ACGCCACTGA TGGGGGAATA CGCCGGCGCA ATCGACCGGC ACTACATTAC 
1101 CTTGGGCGCG CCCTTATTTG TCGCCACCGC CCATCCGGTT ACCCGCAAAG 
1151 CCCTCAACCG CCTGATTATG GCGCAGGATA CAGGCAGCGC GATCAAAGGC 
1201 GCGGTGCGCG TGGATTATTT TTGGGGTTAC GGCGACGAAG CCGGCGAACT 
1251 TGCCGGCAAA CAGAAAACCA CGGGATACGT CTGGCAGCTC CTGCCCAACG 
1301 GCATGAAGCC CGAATACCGC CCGTGA 
This corresponds to the amino acid sequence <SEQ ID 997; ORF 919.ng>: 

g919.pep 

1 MKKHLLRSAL YGIAAAILAA CQSRSIQTFP QPDTSVINGP DRPAGIPDPA 
51 GTTVAGGGAV YTVVPHLSMP HWAAQDFAKS LQSFRLGCAN LKNRQGWQDV 
101 CAQAFQTPVH SFQAKRFFER YFTPWQVAGN GSLAGTVTGY YEPVLKGDGR 
151 RTERARFPIY GIPDDFISVP LPAGLRGGKN LVRIRQTGKN SGTIDNAGGT 
201 HTADLSRFPI TARTTAIKGR FEGSRFLPYH TRNQINGGAL DGKAPILGYA 
251 EDPVELFFMH IQGSGRLKTP SGKYIRIGYA DKNEHPYVSI GRYMADKGYL 
301 KLGQTSMQGI KAYMRQNPQR LAEVLGQNPS YIFFRELAGS GNEGPVGALG 
351 TPLMGEYAGA IDRHYITLGA PLFVATAHPV TRKALNRLIM AQDTGSAIKG 
401 AVRVDYFWGY GDEAGELAGK QKTTGYVWQL LPNGMKPEYR P* 



ORF 919 shows 95.9% identity over a 441 aa overlap with a predicted ORF (ORF 919.ng) 
from N. gonorrhoeae: 

m919/g919 



MKKYLFRAALYGIAAAILAACQSKSIQTFPQPDTSVINGPDRPVGIPDPAGTTVGGGGAV 

llhhl = lllllllllllllihlMIIIIIIIIIIIIIII|:||||lllll|:||lll 

MKKHLLRSALYGIAAAILAACQSRSIQTFPQPDTSVINGPDRPAGIPDPAGTTVAGGGAV 



70 80 90 100 110 120 

YTVVPHLSLPHWAAQDFAKSLQSFRLGCANLKNRQGWQDVCAQAFQTPVHSFQAKQFFER 
IIIIIIMHIIIIIIIIIIIIIIIIIIMIIIIIIIIIIIIIIIIIIIIIIIIhllll 
YTVVPHLSMPHWAAQDFAKSLQSFRLGCANLKNRQGWQDVCAQAFQTPVHSFQAKRFFER 
70 80 90 100 110 120 



130 140 150 160 170 180 

YFTPWQVAGNGSLAGTVTGYYEPVLKGDDRRTAQARFPIYGIPDDFISVPLPAGLRSGKA 
llllllllllllllllllllllllllll III :|||IIIIIIMIIIIIIIIIIhll 
YFTPWQVAGNGSLAGTVTGYYEPVLKGDGRRTERARFPIYGIPDDFISVPLPAGLRGGKN 

130 140 150 160 170 180 

190 200 210 220 230 240 

LVRIRQTGKNSGTIDNTGGTHTADLSRFPITARTTAIKGRFEGSRFLPYHTRNQINGGAL 

IIIIMIIIIIIIIIhlUMMIIIIIIIIIIIIIIIIMIIIMIIMIIIIIMII 

LVRIRQTGKNSGTIDNAGGTHTADLSRFPITARTTAIKGRFEGSRFLPYHTRNQINGGAL 
190 200 210 220 230 240 

250 260 270 280 290 300 

DGKAPILGYAEDPVELFFMHIQGSGRLKTPSGKYIRIGYADKNEHPYVSIGRYMADKGYL 

IIIIIIIIIIIMIIIIIIMIIIIIMIIIIIIIIMMIIIIIIIIilllllllllll 

DGKAPILGYAEDPVELFFMHIQGSGRLKTPSGKYIRIGYADKNEHPYVSIGRYMADKGYL 
250 260 270 280 290 300 



310 320 330 340 350 360 

KLGQTSMQGIKSYMRQNPQRLAEVLGQNPSYIFFRELAGSSNDGPVGALGTPLMGEYAGA 
IIIIIIMIIhlllllllllllllllllllllllllll|:|:||lllllllllllllll 
KLGQTSMQGIKAYMRQNPQRLAEVLGQNPSYIFFRELAGSGNEGPVGALGTPLMGEYAGA 

310 320 330 340 350 360 



370 380 390 400 410 420 

m919.pep VDRHYITLGAPLFVATAHPVTRKALNRLIMAQDTGSAIKGAVRVDYFWGYGDEAGELAGK 

HIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIMIIIIII 

g919 IDRHYITLGAPLFVATAHPVTRKALNRLIMAQDTGSAIKGAVRVDYFWGYGDEAGELAGK 
370 380 390 400 410 420 
430 440 
QKTTGYVWQLLPNGMKPEYRPX 
IIIIIMIIIIIMIMIIIII 
QKTTGYVWQLLPNGMKPEYRPX 

430 440 



The following partial DNA sequence was identified in N. meningitidis <SEQ ID 998>: 

a919.seq 

1 ATGAAAAAAT ACCTATTCCG CGCCGCCCTG TGCGGCATCG CCGCCGCCAT 

51 CCTCGCCGCC TGCCAAAGCA AGAGCATCCA AACCTTTCCG CAACCCGACA 

101 CATCCGTCAT CAACGGCCCG GACCGGCCGG TCGGCATCCC CGACCCCGCC 

151 GGAACGACGG TCGGCGGCGG CGGGGCCGTT TATACCGTTG TGCCGCACCT 

201 GTCCCTGCCC CACTGGGCGG CGCAGGATTT CGCCAAAAGC CTGCAATCCT 

251 TCCGCCTCGG CTGCGCCAAT TTGAAAAACC GCCAAGGCTG GCAGGATGTG 

301 TGCGCCCAAG CCTTTCAAAC CCCCGTCCAT TCCGTTCAGG CAAAACAGTT 

351 TTTTGAACGC TATTTCACGC CGTGGCAGGT TGCAGGCAAC GGAAGCCTTG 

401 CCGGTACGGT TACCGGCTAT TACGAGCCGG TGCTGAAGGG CGACGACAGG 

451 CGGACGGCAC AAGCCCGCTT CCCGATTTAC GGTATTCCCG ACGATTTTAT 

501 CTCCGTCCCC CTGCCTGCCG GTTTGCGGAG CGGAAAAGCC CTTGTCCGCA 

551 TCAGGCAGAC GGGAAAAAAC AGCGGCACAA TCGACAATAC CGGCGGCACA 

601 CATACCGCCG ACCTCTCCCA ATTCCCCATC ACTGCGCGCA CAACGGCAAT 

651 CAAAGGCAGG TTTGAAGGAA GCCGCTTCCT CCCCTACCAC ACGCGCAACC 

701 AAATCAACGG CGGCGCGCTT GACGGCAAAG CCCCGATACT CGGTTACGCC 

751 GAAGACCCCG TCGAACTTTT TTTTATGCAC ATCCAAGGCT CGGGCCGTCT 

801 GAAAACCCCG TCCGGCAAAT ACATCCGCAT CGGCTATGCC GACAAAAACG 

851 AACATCCCTA CGTTTCCATC GGACGCTATA TGGCGGACAA AGGCTACCTC 

901 AAGCTCGGGC AGACCTCGAT GCAGGGCATC AAAGCCTATA TGCAGCAAAA 

951 CCCGCAACGC CTCGCCGAAG TTTTGGGGCA AAACCCCAGC TATATCTTTT 

1001 TCCGAGAGCT TACCGGAAGC AGCAATGACG GCCCTGTCGG CGCACTGGGC 

1051 ACGCCGCTGA TGGGCGAGTA CGCCGGCGCA GTCGACCGGC ACTACATTAC 

1101 CTTGGGCGCG CCCTTATTTG TCGCCACCGC CCATCCGGTT ACCCGCAAAG 

1151 CCCTCAACCG CCTGATTATG GCGCAGGATA CCGGCAGCGC GATTAAAGGC 

1201 GCGGTGCGCG TGGATTATTT TTGGGGATAC GGCGACGAAG CCGGCGAACT 

1251 TGCCGGCAAA CAGAAAACCA CGGGATATGT CTGGCAGCTT CTGCCCAACG 

1301 GTATGAAGCC CGAATACCGC CCGTAA 

This corresponds to the amino acid sequence <SEQ ID 999; ORF 919.a>: 

a919.pep 

1 MKKYLFRAAL CGIAAAILAA CQSKSIQTFP QPDTSVINGP DRPVGIPDPA 

51 GTTVGGGGAV YTVVPHLSLP HWAAQDFAKS LQSFRLGCAN LKNRQGWQDV 

101 CAQAFQTPVH SVQAKQFFER YFTPWQVAGN GSLAGTVTGY YEPVLKGDDR 

151 RTAQARFPIY GIPDDFISVP LPAGLRSGKA LVRIRQTGKN SGTIDNTGGT 

201 HTADLSQFPI TARTTAIKGR FEGSRFLPYH TRNQINGGAL DGKAPILGYA 

251 EDPVELFFMH IQGSGRLKTP SGKYIRIGYA DKNEHPYVSI GRYMADKGYL 

301 KLGQTSMQGI KAYMQQNPQR LAEVLGQNPS YIFFRELTGS SNDGPVGALG 

351 TPLMGEYAGA VDRHYITLGA PLFVATAHPV TRKALNRLIM AQDTGSAIKG 

401 AVRVDYFWGY GDEAGELAGK QKTTGYVWQL LPNGMKPEYR P* 
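
The identity percentages quoted for the ORF comparisons in this section can be sanity-checked with a simple position-by-position count. The sketch below is an illustration under stated assumptions: it handles only ungapped, equal-length sequences (which is how the 919 ORFs appear to align), and the function name `percent_identity` is hypothetical rather than the alignment program used for the original analysis.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of identical positions in two ungapped, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


# Residues 1-60 of m919.pep and a919.pep as listed above.
M919 = "MKKYLFRAALYGIAAAILAACQSKSIQTFPQPDTSVINGPDRPVGIPDPAGTTVGGGGAV"
A919 = "MKKYLFRAALCGIAAAILAACQSKSIQTFPQPDTSVINGPDRPVGIPDPAGTTVGGGGAV"

# The two excerpts differ only at residue 11 (Y versus C): 59 of 60 match.
print(f"{percent_identity(M919, A919):.1f}%")

# Over the full 441 aa overlap, 435 identical positions would round to 98.6%.
print(round(100 * 435 / 441, 1))
```

For alignments that contain gaps, a dedicated alignment tool is needed; this count is only meaningful when the residues are already in register.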

m919/a919 ORFs 919 and 919.a showed a 98.6% identity in 441 aa overlap 

10 20 30 40 50 60 

m919.pep MKKYLFRAALYGIAAAILAACQSKSIQTFPQPDTSVINGPDRPVGIPDPAGTTVGGGGAV 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

a919 MKKYLFRAALCGIAAAILAACQSKSIQTFPQPDTSVINGPDRPVGIPDPAGTTVGGGGAV 

10 20 30 40 50 60 

70 80 90 100 110 120 

m919.pep YTVVPHLSLPHWAAQDFAKSLQSFRLGCANLKNRQGWQDVCAQAFQTPVHSFQAKQFFER 

I I Illlllllllllllllll Ill llllllll 

a919 YTVVPHLSLPHWAAQDFAKSLQSFRLGCANLKNRQGWQDVCAQAFQTPVHSVQAKQFFER 

YFTPWQVAGNGSLAGTVTGYYEPVLKGDDRRTAQARFPIYGIPDDFISVPLPAGLRSGKA 



190 200 210 220 230 240 

LVRIRQTGKNSGTIDNTGGTHTADLSRFPITARTTAIKGRFEGSRFLPYHTRNQINGGAL 

llllllllllllllllllllllllihIilMllllllllllllllllllllllllllll 

LVRIRQTGKNSGTIDNTGGTHTADLSQFPITARTTAIKGRFEGSRFLPYHTRNQINGGAL 
190 200 210 220 230 240 

250 260 270 280 290 300 

DGKAPILGYAEDPVELFFMHIQGSGRLKTPSGKYIRIGYADKNEHPYVSIGRYMADKGYL 
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
DGKAPILGYAEDPVELFFMHIQGSGRLKTPSGKYIRIGYADKNEHPYVSIGRYMADKGYL 

250 260 270 280 290 300 



370 380 390 400 410 420 

VDRHYITLGAPLFVATAHPVTRKALNRLIMAQDTGSAIKGAVRVDYFWGYGDEAGELAGK 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

VDRHYITLGAPLFVATAHPVTRKALNRLIMAQDTGSAIKGAVRVDYFWGYGDEAGELAGK 

370 380 390 400 410 420 



430 440 



QKTTGYVWQLLPNGMKPEYRPX 
QKTTGYVWQLLPNGMKPEYRPX 



121 and 121-1 



The following partial DNA sequence was identified in N. meningitidis <SEQ ID 1000>: 

1 ATGGAAACAC AGCTTTACAT CGGCATCATG TCGGGAACCA GCATGGACGG 

51 GGCGGATGCC GTACTGATAC GGATGGACGG CGGCAAATGG CTGGGCGCGG 

101 AAGGGCACGC CTTTACCCCC TACCCCGGCA GGTTACGCCG CCAATTGCTG 

151 GATTTGCAGG ACACAGGCGC AGACGAACTG CACCGCAGCA GGATTTTGTC 

201 GCAAGAACTC AGCCGCCTAT ATGCGCAAAC CGCCGCCGAA CTGCTGTGCA 

251 GTCAAAACCT CGCACCGTCC GACATTACCG CCCTCGGCTG CCACGGGCAA 

301 ACCGTCCGAC ACGCGCCGGA ACACGGTTAC AGCATACAGC TTGCCGATTT 

351 GCCGCTGCTG GCGxxxxxxx xxxxxxxxxx xxxxxxxxxx xxxxxxxxxx 

401 xxxxxxxxxx xxxxxxxxxx xxxxxxxxxx xxxxxxxxxx xxxxxxxxxx 

451 xxxxxxxxxx xxxxxxxxxx xxxxxxxxxx xxxxxxxxxx xxxxxxxxxx 



: xxxxxxxxxx 

601 xxxxxxCAGC TTCCTTACGA CAAAAACGGT GCAAAGTCGG CACAAGGCAA 
651 CATATTGCCG CAACTGCTCG ACAGGCTGCT CGCCCACCCG TATTTCGCAC 
701 AACGCCACCC TAAAAGCACG GGGCGCGAAC TGTTTGCCAT AAATTGGCTC 
751 GAAACCTACC TTGACGGCGG CGAAAACCGA TACGACGTAT TGCGGACGCT 
801 TTCCCGTTTT ACCGCGCAAA CCGTTTGCGA CGCCGTCTCA CACGCAGCGG 
851 CAGATGCCCG TCAAATGTAC ATTTGCGACG GCGGCATCCG CAATCCTGTT 
901 TTAATGGCGG ATTTGGCAGA ATGTTTCGGC ACACGCGTTT CCCTGCACAG 
951 CACCGCCGAC CTGAACCTCG ATCCGCAATG GGTGGAAGCC GCCGnATTTG 
1001 CGTGGTTGGC GGCGTGTTGG ATTAATCGCA TTCCCGGTAG TCCGCACAAA 
1051 GCAACCGGCG CATCCAAACC GTGTATTCTG AnCGCGGGAT ATTATTATTG 
1101 A 

This corresponds to the amino acid sequence <SEQ ID 1001; ORF 121>: 

m121.pep 

1 METQLYIGIM SGTSMDGADA VLIRMDGGKW LGAEGHAFTP YPGRLRRQLL 

51 DLQDTGADEL HRSRILSQEL SRLYAQTAAE LLCSQNLAPS DITALGCHGQ 

101 TVRHAPEHGY SIQLADLPLL Axxxxxxxxx xxxxxxxxxx xxxxxxxxxx 

201 xxQLPYDKNG AKSAQGNILP QLLDRLLAHP YFAQRHPKST GRELFAINWL 

251 ETYLDGGENR YDVLRTLSRF TAQTVCDAVS HAAADARQMY ICDGGIRNPV 

301 LMADLAECFG TRVSLHSTAD LNLDPQWVEA AXFAWLAACW INRIPGSPHK 

351 ATGASKPCIL XAGYYY* 

The following partial DNA sequence was identified in N. gonorrhoeae <SEQ ID 1002>: 

g121.seq 

1 ATGGAAACAC AGCTTTACAT CGGCATTATG TCGGGAACCA GTATGGACGG 
51 GGCGGATGCC GTGCTGGTAC GGATGGACGG CGGCAAATGG CTGGGCGCGG 
101 AAGGGCACGC CTTTACCCCC TACCCTGACC GGTTGCGCCG CAAATTGCTG 
151 GATTTGCAGG ACACAGGCAC AGACGAACTG CACCGCAGCA GGATGTTGTC 
201 GCAAGAACTC AGCCGCCTGT ACGCGCAAAC CGCCGCCGAA CTGCTGTGCA 
251 GTCAAAACCT CGCTCCGTGC GACATTACCG CCCTCGGCTG CCACGGGCAA 
301 ACCGTCCGAC ACGCGCCGGA ACACGGTtac AGCATACAGC TTGCCGATTT 
351 GCCGCTGCTG GCGGAACTGa cgcggatttT TACCGTCggc gacttcCGCA 
401 GCCGCGACCT TGCTGCCGGC GGacaAGGTG CGCCGCTCGT CCCCGCCTTT 
451 CACGAAGCCC TGTTCCGCGA TGACAGGGAA ACACGCGTGG TACTGAACAT 
501 CGGCGGGATT GCCAACATCA GCGTACTCCC [missing] CCCGCCTTCG 
551 GCTTCGACAC AGGGCCGGGC AATATGCTGA TGGAcgcgtg gacgcaggca 
601 cacTGGcagc TGCCTTACGA CAAAAacggt gcAAAGgcgg cacAAGGCAA 
651 catatTGCcg cAACTGCTCG gcaggctGCT CGCCcaccCG TATTTCTCAC 
701 AACCCcaccc aaAAAGCACG GGgcGCGaac TgtttgcccT AAattggctc 
751 gaaacctAcc ttgacggcgg cgaaaaccga tacgacgtat tgcggacgct 
801 ttcccgattc accgcgcaaA ccgTttggga cgccgtctca CACGCAGCGG 
851 CAGATGCCCG TCAAATGTAC ATTTGCGGCG GCGGCATCCG CAATCCTGTT 
901 TTAATGGCGG ATTTGGCAGA ATGTTTCGGC ACACGCGTTT CCCTGCACAG 
951 CACCGCCGAA CTGAACCTCG ATCCTCAATG GGTGGAGGCG gccgCATTtg 
1001 cgtggttggC GGCGTGTTGG ATTAACCGCA TTCCCGGTAG TCCGCACAAA 
1051 GCGACCGGCG CATCCAAACC GTGTATTCTG GGCGCGGGAT ATTATTATTG 
1101 A 











This corresponds to the amino acid sequence <SEQ ID 1003; ORF 121.ng>: 

g121.pep 

1 METQLYIGIM SGTSMDGADA VLVRMDGGKW LGAEGHAFTP YPDRLRRKLL 

51 DLQDTGTDEL HRSRMLSQEL SRLYAQTAAE LLCSQNLAPC DITALGCHGQ 

101 TVRHAPEHGY SIQLADLPLL AELTRIFTVG DFRSRDLAAG GQGAPLVPAF 

151 HEALFRDDRE TRVVLNIGGI ANISVLPPGA PAFGFDTGPG NMLMDAWTQA 

201 HWQLPYDKNG AKAAQGNILP QLLGRLLAHP YFSQPHPKST GRELFALNWL 

251 ETYLDGGENR YDVLRTLSRF TAQTVWDAVS HAAADARQMY ICGGGIRNPV 

301 LMADLAECFG TRVSLHSTAE LNLDPQWVEA AAFAWLAACW INRIPGSPHK 

351 ATGASKPCIL GAGYYY* 



ORF 121 shows 73.5% identity over a 366 aa overlap with a predicted ORF (ORF 121.ng) 
from N. gonorrhoeae: 

m121/g121 

10 20 30 40 50 60 

m121.pep METQLYIGIMSGTSMDGADAVLIRMDGGKWLGAEGHAFTPYPGRLRRQLLDLQDTGADEL 
g121 METQLYIGIMSGTSMDGADAVLVRMDGGKWLGAEGHAFTPYPDRLRRKLLDLQDTGTDEL 

70 80 90 100 110 120 

m121.pep HRSRILSQELSRLYAQTAAELLCSQNLAPSDITALGCHGQTVRHAPEHGYSIQLADLPLL 

NMHIIIIIIIIIIIII nil I I I I II I I II II I I II I I II I II I I I I I II 

g121 HRSRMLSQELSRLYAQTAAELLCSQNLAPCDITALGCHGQTVRHAPEHGYSIQLADLPLL 
70 80 90 100 110 120 

130 140 150 160 170 180 

m121.pep AXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX 
I : : : 

g121 AELTRIFTVGDFRSRDLAAGGQGAPLVPAFHEALFRDDRETRVVLNIGGIANISVLPPGA 
130 140 150 160 170 180 

190 200 210 220 230 240 

m121.pep XXXXXXXXXXXXXXXXXXXXXXQLPYDKNGAKSAQGNILPQLLDRLLAHPYFAQRHPKST 

g121 PAFGFDTGPGNMLMDAWTQAHWQLPYDKNGAKAAQGNILPQLLGRLLAHPYFSQPHPKST 
190 200 210 220 230 240 

250 260 270 280 290 300 

m121.pep GRELFAINWLETYLDGGENRYDVLRTLSRFTAQTVCDAVSHAAADARQMYICDGGIRNPV 

g121 GRELFALNWLETYLDGGENRYDVLRTLSRFTAQTVWDAVSHAAADARQMYICGGGIRNPV 
250 260 270 280 290 300 

310 320 330 340 350 360 

m121.pep LMADLAECFGTRVSLHSTADLNLDPQWVEAAXFAWLAACWINRIPGSPHKATGASKPCIL 

I I I I I I I I I I I I I I : I I I I I I I II I I I I I I I I I I 

g121 LMADLAECFGTRVSLHSTAELNLDPQWVEAAAFAWLAACWINRIPGSPHKATGASKPCIL 
310 320 330 340 350 360 

m121.pep XAGYYYX 



The following partial DNA sequence was identified in N. meningitidis <SEQ ID 1004>: 

a121.seq 

1 ATGGAAACAC AGCTTTACAT CGGCATCATG TCGGGAACCA GCATGGACGG 

51 GGCGGATGCC GTACTGATAC GGATGGACGG CGGCAAATGG CTGGGCGCGG 

101 AAGGGCACGC CTTTACCCCC TACCCCGGCA GGTTACGCCG CAAATTGCTG 

151 GATTTGCAGG ACACAGGCGC GGACGAACTG CACCGCAGCA GGATGTTGTC 

201 GCAAGAACTC AGCCGCCTGT ACGCGCAAAC CGCCGCCGAA CTGCTGTGCA 

251 GTCAAAACCT CGCGCCGTCC GACATTACCG CCCTCGGCTG CCACGGGCAA 

301 ACCGTCAGAC ACGCGCCGGA ACACAGTTAC AGCGTRCAGC TTGCCGATTT 

351 GCCGCTGCTG GCGGAACGGA CTCAGATTTT TACCGTCGGC GACTTCCGCA 

401 GCCGCGACCT TGCGGCCGGC GGACAAGGCG CGCCGCTCGT CCCCGCCTTT 

451 CACGAAGCCC TGTTCCGCGA CGACAGGGAA ACACGCGCGG TACTGAACAT 

501 CGGCGGGATT GCCAACATCA GCGTACTCCC CCCCGACGCA CCCGCCTTCG 

551 GCTTCGACAC AGGACCGGGC AATATGCTGA TGGACGCGTG GATGCAGGCA 

601 CACTGGCAGC TTCCTTACGA CAAAAACGGT GCAAAGGCGG CACAAGGCAA 

651 CATATTGCCG CAACTGCTCG ACAGGCTGCT CGCCCACCCG TATTTCGCAC 

701 AACCCCACCC TAAAAGCACG GGGCGCGAAC TGTTTGCCCT AAATTGGCTC 

751 GAAACCTACC TTGACGGCGG CGAAAACCGA TACGACGTAT TGCGGACGCT 

801 TTCCCGATTC ACCGCGCAAA CCGTTTTCGA CGCCGTCTCA CACGCAGCGG 

851 CAGATGCCCG TCAAATGTAC ATTTGCGGCG GCGGCATCCG CAATCCTGTT 

901 TTAATGGCGG ATTTGGCAGA ATGTTTCGGC ACACGCGTTT CCCTGCACAG 

951 CACCGCCGAA CTGAACCTCG ATCCGCAATG GGTAGAAGCC GCCGCGTTCG 

1001 CATGGATGGC GGCGTGTTGG GTCAACCGCA TTCCCGGTAG TCCGCACAAA 

1051 GCAACCGGCG CATCCAAACC GTGTATTCTG GGCGCGGGAT ATTATTATTG 

This corresponds to the amino acid sequence <SEQ ID 1005; ORF 121.a>:

a121.pep

1 METQLYIGIM SGTSMDGADA VLIRMDGGKW LGAEGHAFTP YPGRLRRKLL

51 DLQDTGADEL HRSRMLSQEL SRLYAQTAAE LLCSQNLAPS DITALGCHGQ

101 TVRHAPEHSY SVQLADLPLL AERTQIFTVG DFRSRDLAAG GQGAPLVPAF 

151 HEALFRDDRE TRAVLNIGGI ANISVLPPDA PAFGFDTGPG NMLMDAWMQA 

201 HWQLPYDKNG AKAAQGNILP QLLDRLLAHP YFAQPHPKST GRELFALNWL 

251 ETYLDGGENR YDVLRTLSRF TAQTVFDAVS HAAADARQMY ICGGGIRNPV 

301 LMADLAECFG TRVSLHSTAE LNLDPQWVEA AAFAWMAACW VNRIPGSPHK 

351 ATGASKPCIL GAGYYY* 

m121/a121 ORFs 121 and 121.a showed a 74.0% identity in 366 aa overlap

m121.pep METQLYIGIMSGTSMDGADAVLIRMDGGKWLGAEGHAFTPYPGRLRRQLLDLQDTGADEL
I I I I I I I I I I I I I I I I I I I I I I I : I I I I I I I I I I I I
a121 METQLYIGIMSGTSMDGADAVLIRMDGGKWLGAEGHAFTPYPGRLRRKLLDLQDTGADEL

m121.pep HRSRILSQELSRLYAQTAAELLCSQNLAPSDITALGCHGQTVRHAPEHGYSIQLADLPLL

m121.pep AXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

190 200 210 220 230 240
m121.pep XXXXXXXXXXXXXXXXXXXXXXQLPYDKNGAKSAQGNILPQLLDRLLAHPYFAQRHPKST
: I I I I I I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I I
a121 PAFGFDTGPGNMLMDAWMQAHWQLPYDKNGAKAAQGNILPQLLDRLLAHPYFAQPHPKST
190 200 210 220 230 240

250 260 270 280 290 300
m121.pep GRELFAINWLETYLDGGENRYDVLRTLSRFTAQTVCDAVSHAAADARQMYICDGGIRNPV
I I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
a121 GRELFALNWLETYLDGGENRYDVLRTLSRFTAQTVFDAVSHAAADARQMYICGGGIRNPV
250 260 270 280 290 300

310 320 330 340 350 360
m121.pep LMADLAECFGTRVSLHSTADLNLDPQWVEAAXFAWLAACWINRIPGSPHKATGASKPCIL

m121.pep XAGYYYX
I I I I I I
a121 GAGYYYX


Further work revealed the DNA sequence identified in N. meningitidis <SEQ ID 1006>:

m121-1.seq

1 ATGGAAACAC AGCTTTACAT CGGCATCATG TCGGGAACCA GCATGGACGG

51 GGCGGATGCC GTACTGATAC GGATGGACGG CGGCAAATGG CTGGGCGCGG

101 AAGGGCACGC CTTTACCCCC TACCCCGGCA GGTTACGCCG CCAATTGCTG

151 GATTTGCAGG ACACAGGCGC AGACGAACTG CACCGCAGCA GGATTTTGTC

201 GCAAGAACTC AGCCGCCTAT ATGCGCAAAC CGCCGCCGAA CTGCTGTGCA

251 GTCAAAACCT CGCACCGTCC GACATTACCG CCCTCGGCTG CCACGGGCAA

301 ACCGTCCGAC ACGCGCCGGA ACACGGTTAC AGCATACAGC TTGCCGATTT 

351 GCCGCTGCTG GCGGAACGGA CGCGGATTTT TACCGTCGGC GACTTCCGCA 

401 GCCGCGACCT TGCGGCCGGC GGACAAGGCG CGCCACTCGT CCCCGCCTTT

451 CACGAAGCCC TGTTCCGCGA CAACAGGGAA ACACGCGCGG TACTGAACAT

501 CGGCGGGATT GCCAACATCA GCGTACTCCC CCCCGACGCA CCCGCCTTCG 

551 GCTTCGACAC AGGGCCGGGC AATATGCTGA TGGACGCGTG GACGCAGGCA 

601 CACTGGCAGC TTCCTTACGA CAAAAACGGT GCAAAGGCGG CACAAGGCAA 

651 CATATTGCCG CAACTGCTCG ACAGGCTGCT CGCCCACCCG TATTTCGCAC 

701 AACCCCACCC TAAAAGCACG GGGCGCGAAC TGTTTGCCCT AAATTGGCTC 

751 GAAACCTACC TTGACGGCGG CGAAAACCGA TACGACGTAT TGCGGACGCT 

801 TTCCCGTTTT ACCGCGCAAA CCGTTTGCGA CGCCGTCTCA CACGCAGCGG 

851 CAGATGCCCG TCAAATGTAC ATTTGCGGCG GCGGCATCCG CAATCCTGTT 

901 TTAATGGCGG ATTTGGCAGA ATGTTTCGGC ACACGCGTTT CCCTGCACAG 

951 CACCGCCGAC CTGAACCTCG ATCCGCAATG GGTGGAAGCC GCCGNATTTG 

1001 CGTGGTTGGC GGCGTGTTGG ATTAATCGCA TTCCCGGTAG TCCGCACAAA 

1051 GCAACCGGCG CATCCAAACC GTGTATTCTG ANCGCGGGAT ATTATTATTG 

1101 A 

This corresponds to the amino acid sequence <SEQ ID 1007; ORF 121-1>: 

m121-1.pep

1 METQLYIGIM SGTSMDGADA VLIRMDGGKW LGAEGHAFTP YPGRLRRQLL 

51 DLQDTGADEL HRSRILSQEL SRLYAQTAAE LLCSQNLAPS DITALGCHGQ 

101 TVRHAPEHGY SIQLADLPLL AERTRIFTVG DFRSRDLAAG GQGAPLVPAF 

151 HEALFRDNRE TRAVLNIGGI ANISVLPPDA PAFGFDTGPG NMLMDAWTQA 

201 HWQLPYDKNG AKAAQGNILP QLLDRLLAHP YFAQPHPKST GRELFALNWL 

251 ETYLDGGENR YDVLRTLSRF TAQTVCDAVS HAAADARQMY ICGGGIRNPV 

301 LMADLAECFG TRVSLHSTAD LNLDPQWVEA AXFAWLAACW INRIPGSPHK 

351 ATGASKPCIL XAGYYY* 
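
The stated correspondence between a DNA listing and its peptide listing (for example SEQ ID 1006 and SEQ ID 1007) can be spot-checked by translating the DNA in frame. The following Python sketch assumes the standard genetic code and, as a further assumption that is consistent with the X residues shown in the listings, renders any codon containing an ambiguity code such as N as X. The function name `translate` is illustrative and is not part of the original disclosure.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): residue
    for codon, residue in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna):
    """Translate a DNA string in frame 1, ignoring whitespace.

    Codons that are not in the table (for example those containing an
    ambiguity code such as N) are reported as 'X' (an assumption).
    """
    seq = "".join(dna.split()).upper()
    residues = []
    for i in range(0, len(seq) - 2, 3):
        residues.append(CODON_TABLE.get(seq[i:i + 3], "X"))
    return "".join(residues)


# First row of m121-1.seq (positions 1 to 30 shown here).
print(translate("ATGGAAACAC AGCTTTACAT CGGCATCATG"))
# Rows 1051 and 1101 of m121-1.seq, which start on a codon boundary.
print(translate("GCAACCGGCG CATCCAAACC GTGTATTCTG ANCGCGGGAT ATTATTATTG A"))
```

The second call illustrates how the ambiguous codon ANC gives the X residue and how the terminal TGA codon gives the `*` at the end of the peptide listing.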

m121-1/g121 ORFs 121-1 and 121-1.ng showed a 95.6% identity in 365 aa overlap

10 20 30 40 50 60
m121-1.pep METQLYIGIMSGTSMDGADAVLIRMDGGKWLGAEGHAFTPYPGRLRRQLLDLQDTGADEL
lllllllllllllllllllllhlllllllllllllllllll 1111:11111111:111
g121 METQLYIGIMSGTSMDGADAVLVRMDGGKWLGAEGHAFTPYPDRLRRKLLDLQDTGTDEL
10 20 30 40 50 60

70 80 90 100 110 120
m121-1.pep HRSRILSQELSRLYAQTAAELLCSQNLAPSDITALGCHGQTVRHAPEHGYSIQLADLPLL
lllhllllllllllllllllllllllll I HIM IIIIMIIII
g121 HRSRMLSQELSRLYAQTAAELLCSQNLAPCDITALGCHGQTVRHAPEHGYSIQLADLPLL
70 80 90 100 110 120

130 140 150 160 170 180
m121-1.pep AERTRIFTVGDFRSRDLAAGGQGAPLVPAFHEALFRDNRETRAVLNIGGIANISVLPPDA
g121 AELTRIFTVGDFRSRDLAAGGQGAPLVPAFHEALFRDDRETRVVLNIGGIANISVLPPGA
130 140 150 160 170 180

190 200 210 220 230 240
m121-1.pep PAFGFDTGPGNMLMDAWTQAHWQLPYDKNGAKAAQGNILPQLLDRLLAHPYFAQPHPKST
g121 PAFGFDTGPGNMLMDAWTQAHWQLPYDKNGAKAAQGNILPQLLGRLLAHPYFSQPHPKST
190 200 210 220 230 240

250 260 270 280 290 300
m121-1.pep GRELFALNWLETYLDGGENRYDVLRTLSRFTAQTVCDAVSHAAADARQMYICGGGIRNPV
llllllll Mill Illlllllll I I
g121 GRELFALNWLETYLDGGENRYDVLRTLSRFTAQTVWDAVSHAAADARQMYICGGGIRNPV
250 260 270 280 290 300

310 320 330 340 350 360
m121-1.pep LMADLAECFGTRVSLHSTADLNLDPQWVEAAXFAWLAACWINRIPGSPHKATGASKPCIL
IIIMI I I I I I I h I I I I I I I I I I I I I I I I I I I I I I I I I
g121 LMADLAECFGTRVSLHSTAELNLDPQWVEAAAFAWLAACWINRIPGSPHKATGASKPCIL
310 320 330 340 350 360

m121-1.pep XAGYYYX
I I I I I I
g121 GAGYYYX

The following partial DNA sequence was identified in N. meningitidis <SEQ ID 1008>: 

a121-1.seq

1 ATGGAAACAC AGCTTTACAT CGGCATCATG TCGGGAACCA GCATGGACGG 

51 GGCGGATGCC GTACTGATAC GGATGGACGG CGGCAAATGG CTGGGCGCGG 

101 AAGGGCACGC CTTTACCCCC TACCCCGGCA GGTTACGCCG CAAATTGCTG 

151 GATTTGCAGG ACACAGGCGC GGACGAACTG CACCGCAGCA GGATGTTGTC 

201 GCAAGAACTC AGCCGCCTGT ACGCGCAAAC CGCCGCCGAA CTGCTGTGCA

251 GTCAAAACCT CGCGCCGTCC GACATTACCG CCCTCGGCTG CCACGGGCAA 

301 ACCGTCAGAC ACGCGCCGGA ACACAGTTAC AGCGTACAGC TTGCCGATTT 

351 GCCGCTGCTG GCGGAACGGA CTCAGATTTT TACCGTCGGC GACTTCCGCA 

401 GCCGCGACCT TGCGGCCGGC GGACAAGGCG CGCCGCTCGT CCCCGCCTTT

451 CACGAAGCCC TGTTCCGCGA CGACAGGGAA ACACGCGCGG TACTGAACAT 

501 CGGCGGGATT GCCAACATCA GCGTACTCCC CCCCGACGCA CCCGCCTTCG 

551 GCTTCGACAC AGGACCGGGC AATATGCTGA TGGACGCGTG GATGCAGGCA 

601 CACTGGCAGC TTCCTTACGA CAAAAACGGT GCAAAGGCGG CACAAGGCAA 

651 CATATTGCCG CAACTGCTCG ACAGGCTGCT CGCCCACCCG TATTTCGCAC 

701 AACCCCACCC TAAAAGCACG GGGCGCGAAC TGTTTGCCCT AAATTGGCTC

751 GAAACCTACC TTGACGGCGG CGAAAACCGA TACGACGTAT TGCGGACGCT

801 TTCCCGATTC ACCGCGCAAA CCGTTTTCGA CGCCGTCTCA CACGCAGCGG 

851 CAGATGCCCG TCAAATGTAC ATTTGCGGCG GCGGCATCCG CAATCCTGTT 

901 TTAATGGCGG ATTTGGCAGA ATGTTTCGGC ACACGCGTTT CCCTGCACAG 

951 CACCGCCGAA CTGAACCTCG ATCCGCAATG GGTAGAAGCC GCCGCGTTCG 

1001 CATGGATGGC GGCGTGTTGG GTCAACCGCA TTCCCGGTAG TCCGCACAAA 

1051 GCAACCGGCG CATCCAAACC GTGTATTCTG GGCGCGGGAT ATTATTATTG 

1101 A 
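
Each row of a DNA listing is labelled with the position of its first nucleotide (1, 51, 101, ...) and carries up to five blocks of ten nucleotides. As a reading aid, the following Python sketch joins such rows into one contiguous string and checks that each label agrees with the number of nucleotides read so far. The helper name `join_listing` is hypothetical, and the sketch assumes that row labels are intact (a label split by extraction, such as "4 01", would have to be repaired first).

```python
def join_listing(text):
    """Concatenate numbered listing rows, checking each row's start position."""
    sequence = ""
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip the blank separator lines between rows
        start, blocks = int(fields[0]), fields[1:]
        expected = len(sequence) + 1
        if start != expected:
            raise ValueError(f"row labelled {start} but expected {expected}")
        sequence += "".join(blocks)
    return sequence


# First two rows of a121-1.seq as they appear in the listing.
rows = """
1 ATGGAAACAC AGCTTTACAT CGGCATCATG TCGGGAACCA GCATGGACGG

51 GGCGGATGCC GTACTGATAC GGATGGACGG CGGCAAATGG CTGGGCGCGG
"""
print(len(join_listing(rows)))
```

A mismatch between a label and the running length is a quick indication that a block has been dropped or duplicated in a row.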

This corresponds to the amino acid sequence <SEQ ID 1009; ORF 121-1.a>:

a121-1.pep

1 METQLYIGIM SGTSMDGADA VLIRMDGGKW LGAEGHAFTP YPGRLRRKLL 

51 DLQDTGADEL HRSRMLSQEL SRLYAQTAAE LLCSQNLAPS DITALGCHGQ 

101 TVRHAPEHSY SVQLADLPLL AERTQIFTVG DFRSRDLAAG GQGAPLVPAF 

151 HEALFRDDRE TRAVLNIGGI ANISVLPPDA PAFGFDTGPG NMLMDAWMQA 

201 HWQLPYDKNG AKAAQGNILP QLLDRLLAHP YFAQPHPKST GRELFALNWL 

251 ETYLDGGENR YDVLRTLSRF TAQTVFDAVS HAAADARQMY ICGGGIRNPV 

301 LMADLAECFG TRVSLHSTAE LNLDPQWVEA AAFAWMAACW VNRIPGSPHK 

351 ATGASKPCIL GAGYYY* 

m121-1/a121-1 ORFs 121-1 and 121-1.a showed a 96.4% identity in 366 aa overlap

10 20 30 40 50 60
m121-1.pep METQLYIGIMSGTSMDGADAVLIRMDGGKWLGAEGHAFTPYPGRLRRQLLDLQDTGADEL
a121-1 METQLYIGIMSGTSMDGADAVLIRMDGGKWLGAEGHAFTPYPGRLRRKLLDLQDTGADEL
10 20 30 40 50 60

70 80 90 100 110 120
m121-1.pep HRSRILSQELSRLYAQTAAELLCSQNLAPSDITALGCHGQTVRHAPEHGYSIQLADLPLL
a121-1 HRSRMLSQELSRLYAQTAAELLCSQNLAPSDITALGCHGQTVRHAPEHSYSVQLADLPLL
70 80 90 100 110 120

m121-1.pep AERTRIFTVGDFRSRDLAAGGQGAPLVPAFHEALFRDNRETRAVLNIGGIANISVLPPDA
I I I I : I I I I I I I I I I I I ! I I I I I I I I I I I I I I I I I I I : I I I I I I I I I I I I I I I I I I I I I I
a121-1 AERTQIFTVGDFRSRDLAAGGQGAPLVPAFHEALFRDDRETRAVLNIGGIANISVLPPDA
130 140 150 160 170 180

190 200 210 220 230 240
m121-1.pep PAFGFDTGPGNMLMDAWTQAHWQLPYDKNGAKAAQGNILPQLLDRLLAHPYFAQPHPKST
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
a121-1 PAFGFDTGPGNMLMDAWMQAHWQLPYDKNGAKAAQGNILPQLLDRLLAHPYFAQPHPKST
190 200 210 220 230 240

250 260 270 280 290 300
m121-1.pep GRELFALNWLETYLDGGENRYDVLRTLSRFTAQTVCDAVSHAAADARQMYICGGGIRNPV

m121-1.pep XAGYYYX
I I I I I I
a121-1 GAGYYYX
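
The identity figures quoted for each pair of ORFs are percentages of aligned positions that carry the same residue. As an illustration only, the following Python sketch counts identical columns in two gap-free aligned rows of equal length; it is not the alignment program used to produce the figures in this specification, and the function name `percent_identity` is an assumption introduced here. The example uses the first 60-residue block of m121-1.pep and a121-1.pep, which differ only at position 48 (Q versus K).

```python
def percent_identity(row_a, row_b):
    """Percentage of aligned columns holding the same residue in both rows."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    if not row_a:
        raise ValueError("aligned rows must not be empty")
    matches = sum(1 for x, y in zip(row_a, row_b) if x == y)
    return 100.0 * matches / len(row_a)


m121_1 = "METQLYIGIMSGTSMDGADAVLIRMDGGKWLGAEGHAFTPYPGRLRRQLLDLQDTGADEL"
a121_1 = "METQLYIGIMSGTSMDGADAVLIRMDGGKWLGAEGHAFTPYPGRLRRKLLDLQDTGADEL"
print(round(percent_identity(m121_1, a121_1), 1))  # 59 of 60 columns match: 98.3
```

Applied to a whole overlap, the same count is taken over every aligned column; how undetermined residues (X) are scored depends on the alignment program and is not settled by this sketch.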



The following partial DNA sequence was identified in N. meningitidis <SEQ ID 1010>:

m128.seq (partial)

1 ATGACTGACA

51 AATCAAAACC

101 CGCGCGAACA

151 AACACTGTCG

201 GGGCGTGGTG

251 CCGTCTATAA

301 GGACAAGACA

351 CGAATTCGAC

1 TACGCCAGCG

51 wGTCAAAAAA

101 AAmTCAAAAA

151 TGGCACAAAG

201 AGGCGGCGTT

251 CGTGGATGAA

301 CAAyTGCCCA

351 CAGGGAAGCC

401 CCGGACACGG

451 TCCGGCATCA

501 TATGGAAAAT

551 ACGAAGAAAC

601 GCCGCCAAAA

651 CGCCCTCTTT

701 AAAACTGGCA

751 CAGCCGCCCG

801 AGGCGGCTAT

851 GCGCGGACGC ATACGCCGCC TTTGAAGAAA GCGACGATGT CGCCGCCACA

901 GGCAAACGCT TTTGGCAGGA AATCCTCGCC GTCGGGGnAT CGCGCAGCGG

951 nGCAGAATCC TTCAAAGCCT TCCGCGGCCG CGAACCGAGC ATAGACGCAC 

1001 TCTTGCGCCA CAGCGGTTTC GACAACGCGG TCTGA 

This corresponds to the amino acid sequence <SEQ ID 1011; ORF 128>:

m128.pep (partial)

1 MTDNALLHLG EEPRFDQIKT EDIKPALQTA IAEAREQIAA IKAQTHTGWA

51 NTVEPLTGIT ERVGRIWGVV SHLNCVADTP ELRAVYNELM PEITVFFTEI

101 GQDIELYNRF KTIKNSPEFD TLSPAQKTKL NH

1 YASEKLREAK YAFSETXVKK YFPVGXVLNG LFAQXKKLYG IGFTEKTVPV

51 WHKDVRYXEL QQNGEXIGGV YMDLYAREGK RGGAWMNDYK GRRRFSDGTL

101 QLPTAYLVCN FAPPVGGREA RLSHDEILIL FHETGHGLHH LLTQVDELGV

151 SGINGVXWDA VELPSQFMEN FVWEYNVLAQ XSAHEETGVP LPKELXDKXL

201 AAKNFQXGMF XVRQXEFALF DMMIYSEDDE GRLKNWQQVL DSVRKKVAVI

251 QPPEYNRFAL SFGHIFAGGY SAAXYSYAWA EVLSADAYAA FEESDDVAAT

301 GKRFWQEILA VGXSRSGAES FKAFRGREPS IDALLRHSGF DNAV*



The following partial DNA sequence was identified in N. gonorrhoeae <SEQ ID 1012>: 



g128.seq

1 atgattgaca acg c gc ccacttgggc gaagaaccCC GTTTTaatca

51 [illegible] gaagACAtca AACCCGCCGT CCAAACCGCC ATCGCCGAAG

101 CGCGCGGACA AATCGCCGCC GTCAAAGCGC AAACGCACAC CGGCTGGGCG

151 AACACCGTCG [illegible] [illegible] GAACGCGTCG GCAGGATTTG

201 [illegible] [illegible] [illegible] [illegible] GAACTGCGCG

251 [illegible] [illegible] [illegible] CCGTCTTCTT CACCGAAATC

301 GGACAAGACA TCGAACTGTA CAACCGCTTC [illegible] AAAATTCCCC

351 CGAATTTGCA ACGCTTTCCC CCGCACAAAA AACCAAGCTC [illegible]

401 TGCGCGATTT CGTATTGAGC GGCGCGGAAC TGCCGCCCGA ACGGCAGGCA

451 GAACTGGCAA AACTGCAAAC CGAAGGCGCG CAACTTTCCG CCAAATTCTC

501 CCAAAACGTC CTAGACGCGA CCGACGCGTT CGGCATTTAC TTTGACGATG

551 CCGCACCGCT TGCCGGCATT CCCGAAGACG CGCTCGCCAT GTTTGCCGCC

601 GCCGCGCAAA GCGAAGGCAA AACAGGTTAC AAAATCGGCT TGCAGATTCC

651 GCACTACCTT GCCGTTATCC AATACGCCGG CAACCGCGAA CTGCGCGAAC

701 AAATCTACCG CGCCTACGTT ACCCGTGCCA GCGAACTTTC AAACGACGGC

751 AAATTCGACA ACACCGCCAA CATCGACCGC ACGCTCGAAA ACGCATTGAA

801 AACCGccaaa CTGCTCGGCT TTAAAAATTA CGCCGAATTG TCGCTGGCAA

851 CCAAAATGGC GGACACGCCC GAACAGGTTT TAAACTTCCT GCACGACCTC

901 GCCCGCCGCG CCAAACCCTA CGCCGAAAAA GACCTCGCCG AAGTCAAAGC

951 CTTCGCCCGC GAACACCTCG GTCTCGCCGA CCCGCAGCCG TGGGACTTGA

1001 GCTACGCCGG CGAAAAACTG CGCGAAGCCA AATACGCATT CAGCGAAACC

1051 GAAGTCAAAA AATACTTCCC CGTCGGCAAA GTTCTGGCAG GCCTGTTCGC

1101 CCAAATCAAA AAACTCTACG GCATCGGATT CGCCGAAAAA ACCGTTCCCG

1151 TCTGGCACAA AGACGTGCGC TATTTTGAAT TGCAACAAAA CGGCAAAACC

1201 ATCGGCGGCG TTTATATGGA TTTGTACGCA CGCGAAGGCA AACGCGGCGG

1251 CGCGTGGATG AACGACtaca AAGGCCGCCG CCGCTTTGCC GACGgcacGC

1301 TGCAACTGCC CACCGCCTAC CTCGTCTGCA ACTTCGCCCC GCCCGTCGGC

1351 GGCAAAGAAG CGCGTTTAAG CCACGACGAA ATCCTCACCC TCTTCCACGA

1401 AacCGGCCAC GGACTGCACC ACCTGCTTAC CCAAGTGGAC GAACTGGGCG

1451 TGTCCGGCAT CAAcggcgtA GAATGGGACG CGGTCGAACT GCCCAGCCAG

1501 TTTATGGAAA ACTTCGTTTG GGAATACAAT GTATTGGCAC AAATGTCCGC

1551 CCACGAAGAA AccgGCGAGC CCCTGCCGAA AGAACTCTTC GACAAAATGC

1601 TcgcCGCCAA AAACTTCCAG CGCGGTATGT TCCTCGTCCG GCAAATGGAG

1651 TTCGCCCTCT TCGATATGAT GATTTACAGT GAAAGCGACG AATGCCGTCT

1701 GAAAAACTGG CAGCAGGTTT TAGACAGCGT GCGCAAAGAA GTcGCCGTCA

1751 TCCAACCGCC CGAATACAAC CGCTTCGCCA ACAGCTTCGG CCacatctTC

1801 GCcggcGGCT ATTCCGCAGG CTATTACAGC TACGCATGGG CCGAAGTCCt

1851 CAGCACCGAT GCCTACGCCG CCTTTGAAGA AAGcGACGac gtcGCCGCCA

1901 CAGGCAAACG CTTCTGGCAA GAAAtccttg ccgtcggcgg ctCCCGCAGC

1951 gcgGCGGAAT CCTTCAAAGC CTTCCGCGGA CGCGAACCGA GCATAGACGC 

2001 ACTGCTGCGC CAaagcggtT TCGACAACGC gGCttgA 

This corresponds to the amino acid sequence <SEQ ID 1013; ORF 128.ng>: 

g128.pep

1 MIDNALLHLG EEPRFNQIQT EDIKPAVQTA IAEARGQIAA VKAQTHTGWA

51 NTVERLTGIT ERVGRIWGVV SHLNSVVDTP ELRAVYNELM PEITVFFTEI

101 GQDIELYNRF KTIKNSPEFA TLSPAQKTKL DHDLRDFVLS GAELPPERQA

151 ELAKLQTEGA QLSAKFSQNV LDATDAFGIY FDDAAPLAGI PEDALAMFAA

201 AAQSEGKTGY KIGLQIPHYL AVIQYAGNRE LREQIYRAYV TRASELSNDG

251 KFDNTANIDR TLENALKTAK LLGFKNYAEL SLATKMADTP EQVLNFLHDL

301 ARRAKPYAEK DLAEVKAFAR EHLGLADPQP WDLSYAGEKL REAKYAFSET

351 EVKKYFPVGK VLAGLFAQIK KLYGIGFAEK TVPVWHKDVR YFELQQNGKT

401 IGGVYMDLYA REGKRGGAWM NDYKGRRRFA DGTLQLPTAY LVCNFAPPVG

451 GKEARLSHDE ILTLFHETGH GLHHLLTQVD ELGVSGINGV EWDAVELPSQ

501 FMENFVWEYN VLAQMSAHEE TGEPLPKELF DKMLAAKNFQ RGMFLVRQME

551 FALFDMMIYS ESDECRLKNW QQVLDSVRKE VAVIQPPEYN RFANSFGHIF

601 AGGYSAGYYS YAWAEVLSTD AYAAFEESDD VAATGKRFWQ EILAVGGSRS

651 AAESFKAFRG REPSIDALLR QSGFDNAA*



ORF 128 shows 91.7% identity over a 475 aa overlap with a predicted ORF (ORF 128.ng) from N. gonorrhoeae:

m128/g128

10 20 30 40 50 60
g128.pep MIDNALLHLGEEPRFNQIQTEDIKPAVQTAIAEARGQIAAVKAQTHTGWANTVERLTGIT
I llllllllllllhlhllllllhllllllll lllhlllllllllllM IIIM
m128 MTDNALLHLGEEPRFDQIKTEDIKPALQTAIAEAREQIAAIKAQTHTGWANTVEPLTGIT
10 20 30 40 50 60

70 80 90 100 110 120
g128.pep ERVGRIWGVVSHLNSVVDTPELRAVYNELMPEITVFFTEIGQDIELYNRFKTIKNSPEFA
MIMMIIIMII hlllllllllllMMIIIIIIIIIIIMIIIMIIIIIIIII
m128 ERVGRIWGVVSHLNCVADTPELRAVYNELMPEITVFFTEIGQDIELYNRFKTIKNSPEFD
70 80 90 100 110 120

130 140 150 160 170 180
g128.pep TLSPAQKTKLDHDLRDFVLSGAELPPERQAELAKLQTEGAQLSAKFSQNVLDATDAFGIY
lllllllllhl
m128 TLSPAQKTKLNH
130

340 350 360
g128.pep YAGEKLREAKYAFSETEVKKYFPVGKVLAG
IhlllllMIIIIII llllllll II I
m128 YASEKLREAKYAFSETXVKKYFPVGXVLNG
10 20 30

370 380 390 400 410 420
g128.pep LFAQIKKLYGIGFAEKTVPVWHKDVRYFELQQNGKTIGGVYMDLYAREGKRGGAWMNDYK
MM MMIIIMIIIIIIIIIIIII MIIIMMIMIIIMIMIIIIIIMIIM
m128 LFAQXKKLYGIGFTEKTVPVWHKDVRYXELQQNGEXIGGVYMDLYAREGKRGGAWMNDYK
40 50 60 70 80 90

430 440 450 460 470 480
g128.pep GRRRFADGTLQLPTAYLVCNFAPPVGGKEARLSHDEILTLFHETGHGLHHLLTQVDELGV
llllhllllllllllllllllllllhllllllllM IIIIIMIIMIIIIIIIIM
m128 GRRRFSDGTLQLPTAYLVCNFAPPVGGREARLSHDEILILFHETGHGLHHLLTQVDELGV
100 110 120 130 140 150

490 500 510 520 530 540
g128.pep SGINGVEWDAVELPSQFMENFVWEYNVLAQMSAHEETGEPLPKELFDKMLAAKNFQRGMF
IIMII IIIMIIIIIIIIIIIIIIIIM lllllll mill M IIIIIM III
m128 SGINGVXWDAVELPSQFMENFVWEYNVLAQXSAHEETGVPLPKELXDKXLAAKNFQXGMF
160 170 180 190 200 210

550 560 570 580 590 600
g128.pep LVRQMEFALFDMMIYSESDECRLKNWQQVLDSVRKEVAVIQPPEYNRFANSFGHIFAGGY
III IIIIIMIIIIhll MIIIIIMIMIhlllllllllMII llllllllll
m128 XVRQXEFALFDMMIYSEDDEGRLKNWQQVLDSVRKKVAVIQPPEYNRFALSFGHIFAGGY
220 230 240 250 260 270

610 620 630 640 650 660
g128.pep SAGYYSYAWAEVLSTDAYAAFEESDDVAATGKRFWQEILAVGGSRSAAESFKAFRGREPS
Ih IIIIIIMIhllMIIIIIIMIIIIIIIIIIIIIII llhlllllllllllll
m128 SAAXYSYAWAEVLSADAYAAFEESDDVAATGKRFWQEILAVGXSRSGAESFKAFRGREPS
280 290 300 310 320 330

670 679
g128.pep IDALLRQSGFDNAAX
IIMIIHIIIIh
m128 IDALLRHSGFDNAVX
340

The following partial DNA sequence was identified in N. meningitidis <SEQ ID 1014>: 



a128.seq

1 ATGACTGACA ACGCACTGCT CCATTTGGGC GAAGAACCCC GTTTTGATCA

51 AATCAAAACC GAAGACATCA AACCCGCCCT GCAAACCGCC ATTGCCGAAG

101 CGCGCGAACA AATCGCCGCC ATCAAAGCCC AAACGCACAC CGGCTGGGCA

151 AACACTGTCG AACCCCTGAC CGGCATCACC GAACGCGTCG GCAGGATTTG

201 GGGCGTGGTG TCGCACCTCA ACTCCGTCAC CGACACGCCC GAACTGCGCG

251 CCGCCTACAA TGAATTAATG CCCGAAATTA CCGTCTTCTT CACCGAAATC

301 GGACAAGACA TCGAGCTGTA CAACCGCTTC AAAACCATCA AAAACTCCCC

351 CGAGTTCGAC ACCCTCTCCC ACGCGCAAAA AACCAAACTC AACCACGATC

401 TGCGCGATTT CGTCCTCAGC GGCGCGGAAC TGCCGCCCGA ACAGCAGGCA

451 GAATTGGCAA AACTGCAAAC CGAAGGCGCG CAACTTTCCG CCAAATTCTC

501 CCAAAACGTC CTAGACGCGA CCGACGCGTT CGGCATTTAC TTTGACGATG

551 CCGCACCGCT TGCCGGCATT CCCGAAGACG CGCTCGCCAT GTTTGCCGCT

601 GCCGCGCAAA GCGAAGGCAA AACAGGCTAC AAAATCGGTT TGCAGATTCC

651 GCACTACCTC GCCGTCATCC AATACGCCGA CAACCGCAAA CTGCGCGAAC

701 AAATCTACCG CGCCTACGTT ACCCGCGCCA GCGAGCTTTC AGACGACGGC

751 AAATTCGACA ACRCCGCCAA CATCGACCGC ACGCTCGAAA ACGCCCTGCA

801 AACCGCCAAA CTGCTCGGCT TCAAAAACTA CGCCGAATTG TCGCTGGCAA

851 CCAAAATGGC GGACACCCCC GAACAAGTTT TAAACTTCCT GCACGACCTC

901 GCCCGCCGCG CCAAACCCTA CGCCGAAAAA GACCTCGCCG AAGTCAAAGC

951 CTTCGCCCGC GAAAGCCTCG GCCTCGCCGA TTTGCAACCG TGGGACTTGG

1001 GCTACGCCGG CGAAAAACTG CGCGAAGCCA AATACGCATT CAGCGAAACC

1051 GAAGTCAAAA AATACTTCCC CGTCGGCAAA GTATTAAACG GACTGTTCGC

1101 CCAAATCAAA AAACTCTACG GCATCGGATT TACCGAAAAA ACCGTCCCCG

1151 TCTGGCACAA AGACGTGCGC TATTTTGAAT TGCAACAAAA CGGCGAAACC

1201 ATAGGCGGCG TTTATATGGA TTTGTACGCA CGCGAAGGCA AACGCGGCGG

1251 CGCGTGGATG AACGACTACA AAGGCCGCCG CCGTTTTTCA GACGGCACGC

1301 TGCAACTGCC CACCGCCTAC CTCGTCTGCA ACTTCACCCC GCCCGTCGGC

1351 GGCAAAGAAG CCCGCTTGAG CCATGACGAA ATCCTCACCC TCTTCCACGA

1401 AACCGGACAC GGCCTGCACC ACCTGCTTAC CCAAGTCGAC GAACTGGGCG

1451 TATCCGGCAT CAACGGCGTA GAATGGGACG CAGTCGAACT GCCCAGTCAG

1501 TTTATGGAAA ATTTCGTTTG GGAATACAAT GTCTTGGCGC AAATGTCCGC

1551 CCACGAAGAA ACCGGCGTTC CCCTGCCGAA AGAACTCTTC GACAAAATGC

1601 TCGCCGCCAA AAACTTCCAA CGCGGAATGT TCCTCGTCCG CCAAATGGAG

1651 TTCGCCCTCT TTGATATGAT GATTTACAGC GAAGACGACG AAGGCCGTCT

1701 GAAAAACTGG CAACAGGTTT TAGACAGCGT GCGCAAAGAA GTCGCCGTCG

1751 TCCGACCGCC CGAATACAAC CGCTTCGCCA ACAGCTTCGG CCACATCTTC

1801 GCAGGCGGCT ATTCCGCAGG CTATTACAGC TACGCGTGGG CGGAAGTATT

1851 GAGCGCGGAC GCATACGCCG CCTTTGAAGA AAGCGACGAT GTCGCCGCCA

1901 CAGGCAAACG CTTTTGGCAG GAAATCCTCG CCGTCGGCGG ATCGCGCAGC

1951 GCGGCAGAAT CCTTCAAAGC CTTCCGCGGA CGCGAACCGA GCATAGACGC

2001 ACTCTTGCGC CACAGCGGCT TCGACAACGC GGCTTGA



This corresponds to the amino acid sequence <SEQ ID 1015; ORF 128.a>: 

a128.pep

1 MTDNALLHLG EEPRFDQIKT EDIKPALQTA IAEAREQIAA IKAQTHTGWA

51 NTVEPLTGIT ERVGRIWGVV SHLNSVTDTP ELRAAYNELM PEITVFFTEI

101 GQDIELYNRF KTIKNSPEFD TLSHAQKTKL NHDLRDFVLS GAELPPEQQA 

151 ELAKLQTEGA QLSAKFSQNV LDATDAFGIY FDDAAPLAGI PEDALAMFAA 

201 AAQSEGKTGY KIGLQIPHYL AVIQYADNRK LREQIYRAYV TRASELSDDG 

251 KFDNTANIDR TLENALQTAK LLGFKNYAEL SLATKMADTP EQVLNFLHDL 

301 ARRAKPYAEK DLAEVKAFAR ESLGLADLQP WDLGYAGEKL REAKYAFSET 

351 EVKKYFPVGK VLNGLFAQIK KLYGIGFTEK TVPVWHKDVR YFELQQNGET

401 IGGVYMDLYA REGKRGGAWM NDYKGRRRFS DGTLQLPTAY LVCNFTPPVG

451 GKEARLSHDE ILTLFHETGH GLHHLLTQVD ELGVSGINGV EWDAVELPSQ 

501 FMENFVWEYN VLAQMSAHEE TGVPLPKELF DKMLAAKNFQ RGMFLVRQME 

551 FALFDMMIYS EDDEGRLKNW QQVLDSVRKE VAVVRPPEYN RFANSFGHIF

601 AGGYSAGYYS YAWAEVLSAD AYAAFEESDD VAATGKRFWQ EILAVGGSRS 

651 AAESFKAFRG REPSIDALLR HSGFDNAA* 



m128/a128 ORFs 128 and 128.a showed a 66.0% identity in 677 aa overlap

10 20 30 40 50 60
m128.pep MTDNALLHLGEEPRFDQIKTEDIKPALQTAIAEAREQIAAIKAQTHTGWANTVEPLTGIT
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I M I I I I I I I I
a128 MTDNALLHLGEEPRFDQIKTEDIKPALQTAIAEAREQIAAIKAQTHTGWANTVEPLTGIT
10 20 30 40 50 60

70 80 90 100 110 120
m128.pep ERVGRIWGVVSHLNCVADTPELRAVYNELMPEITVFFTEIGQDIELYNRFKTIKNSPEFD
I I I I I I I I I I I I I I I : I I I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I 1
a128 ERVGRIWGVVSHLNSVTDTPELRAAYNELMPEITVFFTEIGQDIELYNRFKTIKNSPEFD
70 80 90 100 110 120

130
m128.pep TLSPAQKTKLNH
III II I I I I I I
a128 TLSHAQKTKLNHDLRDFVLSGAELPPEQQAELAKLQTEGAQLSAKFSQNVLDATDAFGIY
130 140 150 160 170 180

140 150
m128.pep -YASEKLREAKYAFSETXVKKYFPVGX
a128 ARRAKPYAEKDLAEVKAFARESLGLADLQPWDLGYAGEKLREAKYAFSETEVKKYFPVGK
310 320 330 340 350 360

160 170 180 190 200 210
m128.pep VLNGLFAQXKKLYGIGFTEKTVPVWHKDVRYXELQQNGEXIGGVYMDLYAREGKRGGAWM
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I : I I I I I I I I I I I I I I I I I I I I
a128 VLNGLFAQIKKLYGIGFTEKTVPVWHKDVRYFELQQNGETIGGVYMDLYAREGKRGGAWM
370 380 390 400 410 420

220 230 240 250 260 270
m128.pep NDYKGRRRFSDGTLQLPTAYLVCNFAPPVGGREARLSHDEILILFHETGHGLHHLLTQVD
a128 NDYKGRRRFSDGTLQLPTAYLVCNFTPPVGGKEARLSHDEILTLFHETGHGLHHLLTQVD
430 440 450 460 470 480

280 290 300 310 320 330
m128.pep ELGVSGINGVXWDAVELPSQFMENFVWEYNVLAQXSAHEETGVPLPKELXDKXLAAKNFQ
a128 ELGVSGINGVEWDAVELPSQFMENFVWEYNVLAQMSAHEETGVPLPKELFDKMLAAKNFQ
490 500 510 520 530 540

340 350 360 370 380 390
m128.pep XGMFXVRQXEFALFDMMIYSEDDEGRLKNWQQVLDSVRKKVAVIQPPEYNRFALSFGHIF
III III llllllllllllllllllllllllllllll:|l|::||llllll llllll
a128 RGMFLVRQMEFALFDMMIYSEDDEGRLKNWQQVLDSVRKEVAVVRPPEYNRFANSFGHIF
550 560 570 580 590 600

400 410 420 430 440 450
m128.pep AGGYSAAXYSYAWAEVLSADAYAAFEESDDVAATGKRFWQEILAVGXSRSGAESFKAFRG
a128 AGGYSAGYYSYAWAEVLSADAYAAFEESDDVAATGKRFWQEILAVGGSRSAAESFKAFRG
610 620 630 640 650 660

460 470
m128.pep REPSIDALLRHSGFDNAVX
a128 REPSIDALLRHSGFDNAAX
670



Further work revealed the DNA sequence identified in N. meningitidis <SEQ ID 1016>:

m128-1.seq

1 ATGACTGACA ACGCACTGCT CCATTTGGGC GAAGAACCCC GTTTTGATCA 

51 AATCAAAACC GAAGACATCA AACCCGCCCT GCAAACCGCC ATCGCCGAAG 

101 CGCGCGAACA AATCGCCGCC ATCAAAGCCC AAACGCACAC CGGCTGGGCA 

151 AACACTGTCG AACCCCTGAC CGGCATCACC GAACGCGTCG GCAGGATTTG 

201 GGGCGTGGTG TCGCACCTCA ACTCCGTCGC CGACACGCCC GAACTGCGCG 

251 CCGTCTATAA CGAACTGATG CCCGAAATCA CCGTCTTCTT CACCGAAATC 

301 GGACAAGACA TCGAGCTGTA CAACCGCTTC AAAACCATCA AAAATTCCCC 

351 CGAATTCGAC ACCCTCTCCC CCGCACAAAA AACCAAACTC AACCACGATC 

401 TGCGCGATTT CGTCCTCAGC GGCGCGGAAC TGCCGCCCGA ACAGCAGGCA 

451 GAACTGGCAA AACTGCAAAC CGAAGGCGCG CAACTTTCCG CCAAATTCTC 

501 CCAAAACGTC CTAGACGCGA CCGACGCGTT CGGCATTTAC TTTGACGATG 

551 CCGCACCGCT TGCCGGCATT CCCGAAGACG CGCTCGCCAT GTTTGCCGCC 

601 GCCGCGCAAA GCGAAAGCAA AACAGGCTAC AAAATCGGCT TGCAGATTCC 

651 ACACTACCTC GCCGTCATCC AATACGCCGA CAACCGCGAA CTGCGCGAAC 

701 AAATCTACCG CGCCTACGTT ACCCGCGCCA GCGAACTTTC AGACGACGGC

751 AAATTCGACA ACACCGCCAA CATCGACCGC ACGCTCGCAA ACGCCCTGCA

801 AACCGCCAAA CTGCTCGGCT TCAAAAACTA CGCCGAATTG TCGCTGGCAA 

851 CCAAAATGGC GGACACGCCC GAACAAGTTT TAAACTTCCT GCACGACCTC 

901 GCCCGCCGCG CCAAACCCTA CGCCGAAAAA GACCTCGCCG AAGTCAAAGC

951 CTTCGCCCGC GAAAGCCTGA ACCTCGCCGA TTTGCAACCG TGGGACTTGG 

1001 GCTACGCCAG CGAAAAACTG CGCGAAGCCA AATACGCGTT CAGCGAAACC 

1051 GAAGTCAAAA AATACTTCCC CGTCGGCAAA GTATTAAACG GACTGTTCGC 

1101 CCAAATCAAA AAACTCTACG GCATCGGATT TACCGAAAAA ACCGTCCCCG 

1151 TCTGGCACAA AGACGTGCGC TRTTTTGAAT TGCAACAAAA CGGCGAAACC 

1201 ATAGGCGGCG TTTATATGGA TTTGTACGCA CGCGAAGGCA AACGCGGCGG 

1251 CGCGTGGATG AACGACTACA AAGGCCGCCG CCGTTTTTCA GACGGCACGC 

1301 TGCAACTGCC CACCGCCTAC CTCGTCTGCA ACTTCGCCCC ACCCGTCGGC 

1351 GGCAGGGAAG CCCGCCTGAG CCACGACGAA ATCCTCATCC TCTTCCACGA 

1401 AACCGGACAC GGGCTGCACC ACCTGCTTAC CCAAGTGGAC GAACTGGGCG 

1451 TATCCGGCAT CAACGGCGTA GAATGGGACG CGGTCGAACT GCCCAGCCAG 

1501 TTTATGGAAA ATTTCGTTTG GGAATACAAT GTCTTGGCAC AAATGTCAGC 

1551 CCACGAAGAA ACCGGCGTTC CCCTGCCGAA AGAACTCTTC GACAAAATGC 

1601 TCGCCGCCAA AAACTTCCAA CGCGGCATGT TCCTCGTCCG GCAAATGGAG 

1651 TTCGCCCTCT TTGATATGAT GATTTACAGC GAAGACGACG AAGGCCGTCT 

1701 GAAAAACTGG CAACAGGTTT TAGACAGCGT GCGCAAAAAA GTCGCCGTCA 

1751 TCCAGCCGCC CGAATACAAC CGCTTCGCCT TGAGCTTCGG CCACATCTTC 

1801 GCAGGCGGCT ATTCCGCAGG CTATTACAGC TACGCGTGGG CGGAAGTATT 

1851 GAGCGCGGAC GCATACGCCG CCTTTGAAGA AAGCGACGAT GTCGCCGCCA 

1901 CAGGCAAACG CTTTTGGCAG GAAATCCTCG CCGTCGGCGG ATCGCGCAGC 

1951 GCGGCAGAAT CCTTCAAAGC CTTCCGCGGC CGCGAACCGA GCATAGACGC 

2001 ACTCTTGCGC CACAGCGGTT TCGACAACGC GGTCTGA 

This corresponds to the amino acid sequence <SEQ ID 1017; ORF 128-1>: 
m128-1.pep

1 MTDNALLHLG EEPRFDQIKT EDIKPALQTA IAEAREQIAA IKAQTHTGWA

51 NTVEPLTGIT ERVGRIWGVV SHLNSVADTP ELRAVYNELM PEITVFFTEI

101 GQDIELYNRF KTIKNSPEFD TLSPAQKTKL NHDLRDFVLS GAELPPEQQA 

151 ELAKLQTEGA QLSAKFSQNV LDATDAFGIY FDDAAPLAGI PEDALAMFAA 

201 AAQSESKTGY KIGLQIPHYL AVIQYADNRE LREQIYRAYV TRASELSDDG 

251 KFDNTANIDR TLANALQTAK LLGFKNYAEL SLATKMADTP EQVLNFLHDL 

301 ARRAKPYAEK DLAEVKAFAR ESLNLADLQP WDLGYASEKL REAKYAFSET 

351 EVKKYFPVGK VLNGLFAQIK KLYGIGFTEK TVPVWHKDVR YFELQQNGET 

401 IGGVYMDLYA REGKRGGAWM NDYKGRRRFS DGTLQLPTAY LVCNFAPPVG 

451 GREARLSHDE ILILFHETGH GLHHLLTQVD ELGVSGINGV EWDAVELPSQ 

501 FMENFVWEYN VLAQMSAHEE TGVPLPKELF DKMLAAKNFQ RGMFLVRQME 

551 FALFDMMIYS EDDEGRLKNW QQVLDSVRKK VAVIQPPEYN RFALSFGHIF 

601 AGGYSAGYYS YAWAEVLSAD AYAAFEESDD VAATGKRFWQ EILAVGGSRS 

651 AAESFKAFRG REPSIDALLR HSGFDNAV* 

The following partial DNA sequence was identified in N. gonorrhoeae <SEQ ID 1018>:

g128-1.seq (partial)

1 ATGATTGACA ACGCACTGCT CCACTTGGGC GAAGAACCCC GTTTTAATCA

51 AATCAAAACC GAAGACATCA AACCCGCCGT CCAAACCGCC ATCGCCGAAG

101 CGCGCGGACA AATCGCCGCC GTCAAAGCGC AAACGCACAC CGGCTGGGCG

151 AACACCGTCG AGCGTCTGAC CGGCATCACC GAACGCGTCG GCAGGATTTG

201 GGGCGTCGTG TCCCATCTCA ACTCCGTCGT CGACACGCCC GAACTGCGCG

251 CCGTCTATAA CGAACTGATG CCTGAAATCA CCGTCTTCTT CACCGAAATC

301 GGACAAGACA TCGAACTGTA CAACCGCTTC AAAACCATCA AAAATTCCCC

351 CGAATTTGCA ACGCTTTCCC CCGCACAAAA AACCAAGCTC GATCACGACC

401 TGCGCGATTT CGTATTGAGC GGCGCGGAAC TGCCGCCCGA ACGGCAGGCA

451 GAACTGGCAA AACTGCAAAC CGAAGGCGCG CAACTTTCCG CCAAATTCTC

501 CCAAAACGTC CTAGACGCGA CCGACGCGTT CGGCATTTAC TTTGACGATG

551 CCGCACCGCT TGCCGGCATT CCCGAAGACG CGCTCGCCAT GTTTGCCGCC

601 GCCGCGCAAA GCGAAGGCAA AACAGGTTAC AAAATCGGCT TGCAGATTCC

651 GCACTACCTT GCCGTTATCC AATACGCCGG CAACCGCGAA CTGCGCGAAC

701 AAATCTACCG CGCCTACGTT ACCCGTGCCA GCGAACTTTC AAACGACGGC

751 AAATTCGACA ACACCGCCAA CATCGACCGC ACGCTCGAAA ACGCATTGAA

801 AACCGCCAAA CTGCTCGGCT TTAAAAATTA CGCCGAATTG TCGCTGGCAA

851 CCAAAATGGC GGACACGCCC GAACAGGTTT TAAACTTCCT GCACGACCTC

901 GCCCGCCGCG CCAAACCCTA CGCCGAAAAA GACCTCGCCG AAGTCAAAGC

951 CTTCGCCCGC GAACACCTCG GTCTCGCCGA CCCGCAGCCG TGGGACTTGA

1001 GCTACGCCGG CGAAAAACTG CGCGAAGCCA AATACGCATT CAGCGAAACC

1051 GAAGTCAAAA AATACTTCCC CGTCGGCAAA GTTCTGGCAG GCCTGTTCGC

1101 CCAAATCAAA AAACTCTACG GCATCGGATT CGCCGAAAAA ACCGTTCCCG

1151 TCTGGCACAA AGACGTGCGC TATTTTGAAT TGCAACAAAA CGGCAAAACC

1201 ATCGGCGGCG TTTATATGGA TTTGTACGCA CGCGAAGGCA AACGCGGCGG

1251 CGCGTGGATG AACGACTACA AAGGCCGCCG CCGCTTTGCC GACGGCACGC

1301 TGCAACTGCC CACCGCCTAC CTCGTCTGCA ACTTCGCCCC GCCCGTCGGC

1351 GGCAAAGAAG CGCGTTTAAG CCACGACGAA ATCCTCACCC TCTTCCACGA

1401 AACCGGCCAC GGACTGCACC ACCTGCTTAC CCAAGTGGAC GAACTGGGCG

1451 TGTCCGGCAT CAACGGCGTA AAA


This corresponds to the amino acid sequence <SEQ ID 1019; ORF 128-1.ng>:

g128-1.pep (partial)

1 MIDNALLHLG EEPRFNQIKT EDIKPAVQTA IAEARGQIAA VKAQTHTGWA

51 NTVERLTGIT ERVGRIWGVV SHLNSVVDTP ELRAVYNELM PEITVFFTEI

101 GQDIELYNRF KTIKNSPEFA TLSPAQKTKL DHDLRDFVLS GAELPPERQA 

151 ELAKLQTEGA QLSAKFSQNV LDATDAFGIY FDDAAPLAGI PEDALAMFAA 

201 AAQSEGKTGY KIGLQIPHYL AVIQYAGNRE LREQIYRAYV TRASELSNDG

251 KFDNTANIDR TLENALKTAK LLGFKNYAEL SLATKMADTP EQVLNFLHDL 

301 ARRAKPYAEK DLAEVKAFAR EHLGLADPQP WDLSYAGEKL REAKYAFSET 

351 EVKKYFPVGK VLAGLFAQIK KLYGIGFAEK TVPVWHKDVR YFELQQNGKT 

401 IGGVYMDLYA REGKRGGAWM NDYKGRRRFA DGTLQLPTAY LVCNFAPPVG

451 GKEARLSHDE ILTLFHETGH GLHHLLTQVD ELGVSGINGV K



m128-1/g128-1 ORFs 128-1 and 128-1.ng showed a 94.5% identity

g128-1.pep MIDNALLHLGEEPRFNQIKTEDIKPAVQTAIAEARGQIAAVKAQTHTGWANTVERLTGIT
I lllllllllllll:llllllllll:|lllllll I I I I : I I I I I I I I I I I I I Mill
m128-1 MTDNALLHLGEEPRFDQIKTEDIKPALQTAIAEAREQIAAIKAQTHTGWANTVEPLTGIT
10 20 30 40 50 60

70 80 90 100 110 120
g128-1.pep ERVGRIWGVVSHLNSVVDTPELRAVYNELMPEITVFFTEIGQDIELYNRFKTIKNSPEFA
lllllllllllllllhllllllllllllllllllllllllllllllllllllllllll
m128-1 ERVGRIWGVVSHLNSVADTPELRAVYNELMPEITVFFTEIGQDIELYNRFKTIKNSPEFD
70 80 90 100 110 120

g128-1.pep TLSPAQKTKLDHDLRDFVLSGAELPPERQAELAKLQTEGAQLSAKFSQNVLDATDAFGIY

g128-1.pep ARRAKPYAEKDLAEVKAFAREHLGLADPQPWDLSYAGEKLREAKYAFSETEVKKYFPVGK
m128-1 ARRAKPYAEKDLAEVKAFARESLNLADLQPWDLGYASEKLREAKYAFSETEVKKYFPVGK
310 320 330 340 350 360

370 380 390 400 410 420
g128-1.pep VLAGLFAQIKKLYGIGFAEKTVPVWHKDVRYFELQQNGKTIGGVYMDLYAREGKRGGAWM
m128-1 VLNGLFAQIKKLYGIGFTEKTVPVWHKDVRYFELQQNGETIGGVYMDLYAREGKRGGAWM
370 380 390 400 410 420

430 440 450 460 470 480
g128-1.pep NDYKGRRRFADGTLQLPTAYLVCNFAPPVGGKEARLSHDEILTLFHETGHGLHHLLTQVD
I I I I I I I I I : I I I I I I I I I I I I I I I I I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I
m128-1 NDYKGRRRFSDGTLQLPTAYLVCNFAPPVGGREARLSHDEILILFHETGHGLHHLLTQVD
430 440 450 460 470 480

490
g128-1.pep ELGVSGINGVK
m128-1 ELGVSGINGVEWDAVELPSQFMENFVWEYNVLAQMSAHEETGVPLPKELFDKMLAAKNFQ
490 500 510 520 530 540

The following DNA sequence was identified in N. meningitidis <SEQ ID 1020>: 

a128-1.seq

1 ATGACTGACA ACGCACTGCT CCATTTGGGC GAAGAACCCC GTTTTGATCA 

51 AATCAAAACC GAAGACATCA AACCCGCCCT GCAAACCGCC ATTGCCGAAG 

101 CGCGCGAACA AATCGCCGCC ATCAAAGCCC AAACGCACAC CGGCTGGGCA 

151 AACACTGTCG AACCCCTGAC CGGCATCACC GAACGCGTCG GCAGGATTTG 

201 GGGCGTGGTG TCGCACCTCA ACTCCSTCAC CGACACGCCC GAACTGCGCG 

251 CCGCCTACAA TGAATTAATG CCCGAAATTA CCGTCTTCTT CACCGAAATC 

301 GGACAAGACA TCGAGCTGTA CAACCGCTTC AAAACCATCA AAAACTCCCC 

351 CGAGTTCGAC ACCCTCTCCC ACGCGCAAAA AACCAAACTC AACCACGATC 

401 TGCGCGATTT CGTCCTCAGC GGCGCGGAAC TGCCGCCCGA ACAGCAGGCA 

451 GARTTGGCAA AACTGCAAAC CGAAGGCGCG CAACTTTCCG CCAAATTCTC 

501 CCAAAACGTC CTAGACGCGA CCGACGCGTT CGGCATTTAC TTTGACGATG 

551 CCGCACCGCT TGCCGGCATT CCCGAAGACG CGCTCGCCAT GTTTGCCGCT 

601 GCCGCGCAAA GCGAAGGCAA AACAGGCTAC AAAATCGGTT TGCAGATTCC 

651 GCACTACCTC GCCGTCATCC AATACGCCGA CAACCGCAAA CTGCGCGAAC 

701 AAATCTACCG CGCCTACGTT ACCCGCGCCA GCGAGCTTTC AGACGACGGC 

751 AAATTCGACA ACACCGCCAA CATCGACCGC ACGCTCGAAA ACGCCCTGCA 

801 AACCGCCAAA CTGCTCGGCT TCAAAAACTA CGCCGAATTG TCGCTGGCAA 

851 CCAAARTGGC GGACACCCCC GAACAAGTTT TAAACTTCCT GCACGACCTC 

901 GCCCGCCGCG CCAAACCCTA CGCCGAAAAA GACCTCGCCG AAGTCAAAGC 

951 CTTCGCCCGC GAAAGCCTCG GCCTCGCCGA TTTGCAACCG TGGGACTTGG 

1001 GCTACGCCGG CGAAAAACTG CGCGAAGCCA AATACGCATT CAGCGAAACC 

1051 GAAGTCAAAA AATACTTCCC CGTCGGCAAA GTATTAAACG GACTGTTCGC 

1101 CCAAATCAAA AAACTCTACG GCATCGGATT TACCGAAAAA ACCGTCCCCG 

1151 TCTGGCACAA AGACGTGCGC TATTTTGAAT TGCAACAAAA CGGCGAAACC 

1201 ATAGGCGGCG TTTATATGGA TTTGTACGCA CGCGAAGGCA AACGCGGCGG 

1251 CGCGTGGATG AACGACTACA AAGGCCGCCG CCGTTTTTCA GACGGCACGC 

1301 TGCAACTGCC CACCGCCTAC CTCGTCTGCA ACTTCACCCC GCCCGTCGGC 

1351 GGCAAAGAAG CCCGCTTGAG CCATGACGAA ATCCTCACCC TCTTCCACGA 

1401 AACCGGACAC GGCCTGCACC ACCTGCTTAC CCAAGTCGAC GAACTGGGCG 

1451 TATCCGGCAT CAACGGCGTA GAATGGGACG CAGTCGAACT GCCCAGTCAG 

1501 TTTATGGAAA ATTTCGTTTG GGAATACART GTCTTGGCGC AAATGTCCGC 

1551 CCACGAAGAA ACCGGCGTTC CCCTGCCGAA AGAACTCTTC GACAAAATGC 

1601 TCGCCGCCAA AAACTTCCAA CGCGGAATGT TCCTCGTCCG CCAAATGGAG 

1651 TTCGCCCTCT TTGATATGAT GATTTACAGC GAAGACGACG AAGGCCGTCT 

1701 GAAAAACTGG CAACAGGTTT TAGACAGCGT GCGCAAAGAA GTCGCCGTCG

1751 TCCGACCGCC CGAATACAAC CGCTTCGCCA ACAGCTTCGG CCACATCTTC

1801 GCAGGCGGCT ATTCCGCAGG CTATTACAGC TACGCGTGGG CGGAAGTATT 

1851 GAGCGCGGAC GCATACGCCG CCTTTGAAGA AAGCGACGAT GTCGCCGCCA 

1901 CAGGCAAACG CTTTTGGCAG GAAATCCTCG CCGTCGGCGG ATCGCGCAGC 
This corresponds to the amino acid sequence <SEQ ID 1021; ORF 128-1.a>:

a128-1.pep

1 MTDNALLHLG EEPRFDQIKT EDIKPALQTA IAEAREQIAA IKAQTHTGWA

51 NTVEPLTGIT ERVGRIWGVV SHLNSVTDTP ELRAAYNELM PEITVFFTEI

101 GQDIELYNRF KTIKNSPEFD TLSHAQKTKL NHDLRDFVLS GAELPPEQQA 

151 ELAKLQTEGA QLSAKFSQNV LDATDAFGIY FDDAAPLAGI PEDALAMFAA 

201 AAQSEGKTGY KIGLQIPHYL AVIQYADNRK LREQIYRAYV TRASELSDDG 

251 KFDNTANIDR TLENALQTAK LLGFKNYAEL SLATKMADTP EQVLNFLHDL 

301 ARRAKPYAEK DLAEVKAFAR ESLGLADLQP WDLGYAGEKL REAKYAFSET 

351 EVKKYFPVGK VLNGLFAQIK KLYGIGFTEK TVPVWHKDVR YFELQQNGET 

401 IGGVYMDLYA REGKRGGAWM NDYKGRRRFS DGTLQLPTAY LVCNFTPPVG 

451 GKEARLSHDE ILTLFHETGH GLHHLLTQVD ELGVSGINGV EWDAVELPSQ 

501 FMENFVWEYN VLAQMSAHEE TGVPLPKELF DKMLAAKNFQ RGMFLVRQME 

551 FALFDMMIYS EDDEGRLKNW QQVLDSVRKE VAVVRPPEYN RFANSFGHIF

601 AGGYSAGYYS YAWAEVLSAD AYAAFEESDD VAATGKRFWQ EILAVGGSRS 

651 AAESFKAFRG REPSIDALLR HSGFDNAA* 

m128-1/a128-1 ORFs 128-1 and 128-1.a showed a 97.8% identity in 677 aa overlap



a128-1.pep
m128-1



MTDNALLHLGEEPRFDQIKTEDIKPALQTAIAEAREQIAAIKAQTHTGWANTVEPLTGIT 
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
MTDNALLHLGEEPRFDQIKTEDIKPALQTAIAEAREQIAAIKAQTHTGWANTVEPLTGIT 



10 20 30 40 50 60

70 80 90 100 110 120

ERVGRIWGVVSHLNSVTDTPELRAAYNELMPEITVFFTEIGQDIELYNRFKTIKNSPEFD

Illlll:|||||||:ll I I I I I I I I I I I I i I I I II 

ERVGRIWGVVSHLNSVADTPELRAVYNELMPEITVFFTEIGQDIELYNRFKTIKNSPEFD
70 80 90 100 110 120 



130 140 150 160 170



a128-1.pep TLSHAQKTKLNHDLRDFVLSGAELPPEQQAELAKLQTEGAQLSAKFSQNVLDATDAFGIY



190 200 210 220 230 240 

FDDAAPLAGIPEDALAMFAAAAQSEGKTGYKIGLQIPHYLAVIQYADNRKLREQIYRAYV 
I I I I I I I I I I I I I I I I I I I I I II I I : I I I I II I I I I I I I I I I I I I I I I I : I I I I I I I I I I 
FDDAAPLAGIPEDALAMFAAAAQSESKTGYKIGLQIPHYLAVIQYADNRELREQIYRAYV 

190 200 210 220 230 240 

250 260 270 280 290 300 

TRASELSDDGKFDNTANIDRTLENALQTAKLLGFKNYAELSLATKMADTPEQVLNFLHDL 
I I I I I I I I I I I I II I I I I I I I I I I I I I I I I II I I II I I I II I I I I II I I I II I II I II I 
TRASELSDDGKFDNTANIDRTLANALQTAKLLGFKNYAELSLATKMADTPEQVLNFLHDL

250 260 270 280 290 300 



370 380 390 400 410 420 

VLNGLFAQIKKLYGIGFTEKTVPVWHKDVRYFELQQNGETIGGVYMDLYAREGKRGGAWM 

VLNGLFAQIKKLYGIGFTEKTVPVWHKDVRYFELQQNGETIGGVYMDLYAREGKRGGAWM
370 380 390 400 410 420

430 440 450 460 470 480 

NDYKGRRRFSDGTLQLPTAYLVCNFTPPVGGKEARLSHDEILTLFHETGHGLHHLLTQVD 

I I I I I I I I I I I I I I I I I 

NDYKGRRRFSDGTLQLPTAYLVCNFAPPVGGREARLSHDEILILFHETGHGLHHLLTQVD 

430 440 450 460 470 480 

490 500 510 520 530 540 

ELGVSGINGVEWDAVELPSQFMENFVWEYNVLAQMSAHEETGVPLPKELFDKMLAAKNFQ

I I I I I I I I I I I I I I I I ! I I I I I I I I I i I I I I I I I I I I I I I I I I I I I I 

ELGVSGINGVEWDAVELPSQFMENFVWEYNVLAQMSAHEETGVPLPKELFDKMLAAKNFQ
490 500 510 520 530 540 

550 560 570 580 590 600 

RGMFLVRQMEFALFDMMIYSEDDEGRLKNWQQVLDSVRKEVAVVRPPEYNRFANSFGHIF
lllllllllllllllllllllllllllllllllllllll:|||::|||||||| |||||| 
RGMFLVRQMEFALFDMMIYSEDDEGRLKNWQQVLDSVRKKVAVIQPPEYNRFALSFGHIF 

550 560 570 580 590 600 

610 620 630 640 650 660 

AGGYSAGYYSYAWAEVLSADAYAAFEESDDVAATGKRFWQEILAVGGSRSAAESFKAFRG 

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

AGGYSAGYYSYAWAEVLSADAYAAFEESDDVAATGKRFWQEILAVGGSRSAAESFKAFRG 

610 620 630 640 650 660 

670 679 
REPSIDALLRHSGFDNAAX 

I Ill I : 

REPSIDALLRHSGFDNAVX 

670 



206 



The following partial DNA sequence was identified in N. meningitidis <SEQ ID 1022>: 



m206.seq

1 ATGTTTCCCC CCGACAAAAC CCTTTTCCTC TGTCTCAGCG CACTGCTCCT

51 CGCCTCATGC GGCACGACCT CCGGCAAACA CCGCCAACCG AAACCCAAAC

101 AGACAGTCCG GCAAATCCAA GCCGTCCGCA TCAGCCACAT CGACCGCACA

151 CAAGGCTCGC AGGAACTCAT GCTCCACAGC CTCGGACTCA TCGGCACGCC

201 CTACAAATGG GGCGGCAGCA GCACCGCAAC CGGCTTCGAT TGCAGCGGCA

251 TGATTCAATT CGTTTACAAr AACGCCCTCA ACGTCAAGCT GCCGCGCACC

301 GCCCGCGACA TGGCsGCGGC AAGCCGsAAA ATCCCCGAcA GCCGCyTCAA

351 GGCCGGCGAC CTCGTATTCT TCAACACCGG CGGCGCACAC CGCTACTCAC

401 ACGTCGGACT CTACATCGGC AACGGCGAAT TCATCCATGC CCCCAGCAGC

451 GGCAAAACCA TCAAAACCGA AAAACTCTCC ACACCGTTTT ACGCCAAAAA

501 CTACCTCGGC GCACATACTT TTTTTACAGA ATGA





This corresponds to the amino acid sequence <SEQ ID 1023; ORF 206>: 

m206.pep

1 MFPPDKTLFL CLSALLLASC GTTSGKHRQP KPKQTVRQIQ AVRISHIDRT
51 QGSQELMLHS LGLIGTPYKW GGSSTATGFD CSGMIQFVYK NALNVKLPRT
101 ARDMAAASRK IPDSRXKAGD LVFFNTGGAH RYSHVGLYIG NGEFIHAPSS 
151 GKTIKTEKLS TPFYAKNYLG AHTFFTE* 
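
Several positions in the m206 DNA listing are given as nucleotide ambiguity codes (for example the r in CGTTTACAAr and the y in GCCGCyTCAA), and residue 116 of ORF 206 is listed as X. The following Python sketch illustrates one way such codons could be resolved. It assumes the standard genetic code and the IUPAC nucleotide ambiguity codes; the names CODON_TABLE, IUPAC and translate_codon are hypothetical helpers introduced for illustration and are not part of the disclosure.

```python
from itertools import product

# Standard genetic code, with bases ordered T, C, A, G (an assumption;
# the source does not state which translation table was used).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# IUPAC nucleotide ambiguity codes mapped to the bases they can stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def translate_codon(codon):
    """Return the encoded residue, or 'X' if the ambiguity changes it."""
    options = [IUPAC[base] for base in codon.upper()]
    residues = {CODON_TABLE["".join(bases)] for bases in product(*options)}
    return residues.pop() if len(residues) == 1 else "X"


print(translate_codon("AAr"))  # AAA and AAG both encode K
print(translate_codon("yTC"))  # CTC encodes L, TTC encodes F, so X
```

Under these assumptions the ambiguous codon AAr still yields a defined residue, whereas yTC does not, which is consistent with the X shown at position 116.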

The following partial DNA sequence was identified in N. gonorrhoeae <SEQ ID 1024>: 

g206.seq

1 atgttttccc ccgacaaaac ccttttcctc tgtctcggcg cactgctcct 



51 cgcctcatgc ggcacgacct ccggcaaaca ccgccaaccg aaacccaaac

101 agacagtccg gcaaatccaa gccgtccgca tcagccacat cggccgcaca 

151 caaggctcgc aggaactcat gctccacagc ctcggactca tcggcacgcc 

201 ctacaaatgg ggcggcagca gcaccgcaac cggcttcgac tgcagcggca

251 tgattcaatt ggtttacaaa aacgccctca acgtcaagct gccgcgcacc 

301 gcccgcgaca tggcggcggc aagccgcaaa atccccgaca gccgcctcaa

351 ggccggcgac atcgtattct tcaacaccgg cggcgcacac cgctactcac 

401 acgtcggact ctacatcggc aacggcgaat tcatccatgc ccccggcagc 

451 ggcaaaacca tcaaaaccga aaaactctcc acaccgtttt acgccaaaaa 

501 ctaccttgga gcgcatacgt tttttacaga atga 

This corresponds to the amino acid sequence <SEQ ID 1025; ORF 206.ng>: 

g206 .pep 

1 MFSPDKTLFL CLGALLLASC GTTSGKHRQP KPKQTVRQIQ AVRISHIGRT

51 QGSQELMLHS LGLIGTPYKW GGSSTATGFD CSGMIQLVYK NALNVKLPRT 

101 ARDMAAASRK IPDSRLKAGD IVFFNTGGAH RYSHVGLYIG NGEFIHAPGS 

151 GKTIKTEKLS TPFYAKNYLG AHTFFTE* 



ORF 206 shows 96.0% identity over a 177 aa overlap with a predicted ORF (ORF 206.ng)
from N. gonorrhoeae:

m206/g206

10 20 30 40 50 60 

MFPPDKTLFLCLSALLLASCGTTSGKHRQPKPKQTVRQIQAVRISHIDRTQGSQELMLHS 

II llllllllhlllllllllllllllllllllllMIIIMMM IIMMMIIII 

MFSPDKTLFLCLGALLLASCGTTSGKHRQPKPKQTVRQIQAVRISHIGRTQGSQELMLHS
10 20 30 40 50 60 

70 80 90 100 110 120 

LGLIGTPYKWGGSSTATGFDCSGMIQFVYKNALNVKLPRTARDMAAASRKIPDSRXKAGD 

illllllllllllllllllllllllhlllllllllllllllllMIIIIIIIIMIII 

LGLIGTPYKWGGSSTATGFDCSGMIQLVYKNALNVKLPRTARDMAAASRKIPDSRLKAGD 
70 80 90 100 110 120 

130 140 150 160 170

LVFFNTGGAHRYSHVGLYIGNGEFIHAPSSGKTIKTEKLSTPFYAKNYLGAHTFFTEX 

MIIIMIIIIMUIMIIIIIIIIIIHIIIIIIIMIIIIIIIIIIIIMIMI 

IVFFNTGGAHRYSHVGLYIGNGEFIHAPGSGKTIKTEKLSTPFYAKNYLGAHTFFTE 
130 140 150 160 170 
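
The identity figure quoted for ORF 206 and ORF 206.ng can be checked by direct counting, because this alignment contains no gaps. The following Python sketch is an illustration only: percent_identity and mismatches are hypothetical helper names, the terminal stop symbol is omitted, and the unresolved residue X is simply counted as a mismatch, which is an assumption about how the original alignment program scored it.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of aligned positions carrying the same residue."""
    if len(seq_a) != len(seq_b):
        raise ValueError("this sketch only handles ungapped, equal-length alignments")
    matches = sum(1 for a, b in zip(seq_a, seq_b) if a == b)
    return 100.0 * matches / len(seq_a)


def mismatches(seq_a, seq_b):
    """List (1-based position, residue_a, residue_b) for differing positions."""
    return [(i + 1, a, b) for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]


# ORF 206 (N. meningitidis) and ORF 206.ng (N. gonorrhoeae), in blocks of ten.
ORF206 = (
    "MFPPDKTLFL" "CLSALLLASC" "GTTSGKHRQP" "KPKQTVRQIQ" "AVRISHIDRT"
    "QGSQELMLHS" "LGLIGTPYKW" "GGSSTATGFD" "CSGMIQFVYK" "NALNVKLPRT"
    "ARDMAAASRK" "IPDSRXKAGD" "LVFFNTGGAH" "RYSHVGLYIG" "NGEFIHAPSS"
    "GKTIKTEKLS" "TPFYAKNYLG" "AHTFFTE"
)
ORF206_NG = (
    "MFSPDKTLFL" "CLGALLLASC" "GTTSGKHRQP" "KPKQTVRQIQ" "AVRISHIGRT"
    "QGSQELMLHS" "LGLIGTPYKW" "GGSSTATGFD" "CSGMIQLVYK" "NALNVKLPRT"
    "ARDMAAASRK" "IPDSRLKAGD" "IVFFNTGGAH" "RYSHVGLYIG" "NGEFIHAPGS"
    "GKTIKTEKLS" "TPFYAKNYLG" "AHTFFTE"
)

print(f"{percent_identity(ORF206, ORF206_NG):.1f}% identity over {len(ORF206)} aa")
for position, a, b in mismatches(ORF206, ORF206_NG):
    print(position, a, b)
```

With these inputs the sketch reports seven differing positions out of 177, in agreement with the 96.0% identity stated above.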



The following partial DNA sequence was identified in N. meningitidis <SEQ ID 1026>: 
a206.seq

1 ATGTTTCCCC CCGACAAAAC CCTTTTCCTC TGTCTCAGCG CACTGCTCCT 

51 CGCCTCATGC GGCACGACCT CCGGCAAACA CCGCCAACCG AAACCCAAAC 

101 AGACAGTCCG GCAAATCCAA GCCGTCCGCA TCAGCCACAT CGACCGCACA 

151 CAAGGCTCGC AGGAACTCAT GCTCCACAGC CTCGGACTCA TCGGCACGCC 

201 CTACAAATGG GGCGGCAGCA GCACCGCAAC CGGCTTCGAT TGCAGCGGCA 

251 TGATTCAATT CGTTTACAAA AACGCCCTCA ACGTCAAGCT GCCGCGCACC 

301 GCCCGCGACA TGGCGGCGGC AAGCCGCAAA ATCCCCGACA GCCGCCTTAA 

351 GGCCGGCGAC CTCGTATTCT TCAACACCGG CGGCGCACAC CGCTACTCAC 

401 ACGTCGGACT CTATATCGGC AACGGCGAAT TCATCCATGC CCCCAGCAGC 

451 GGCAAAACCA TCAAAACCGA AAAACTCTCC ACACCGTTTT ACGCCAAAAA 

501 CTACCTCGGC GCACATACTT TCTTTACAGA ATGA 






This corresponds to the amino acid sequence <SEQ ID 1027; ORF 206.a>: 

a206.pep 



WO 00/22430 PCT/US99/23573



1 MFPPDKTLFL CLSALLLASC GTTSGKHRQP KPKQTVRQIQ AVRISHIDRT

51 QGSQELMLHS LGLIGTPYKW GGSSTATGFD CSGMIQFVYK NALNVKLPRT 

101 ARDMAAASRK IPDSRLKAGD LVFFNTGGAH RYSHVGLYIG NGEFIHAPSS 

151 GKTIKTEKLS TPFYAKNYLG AHTFFTE* 

m206/a206 ORFs 206 and 206.a showed a 99.4% identity in 177 aa overlap

10 20 30 40 50 60 

MFPPDKTLFLCLSALLLASCGTTSGKHRQPKPKQTVRQIQAVRISHIDRTQGSQELMLHS 

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

MFPPDKTLFLCLSALLLASCGTTSGKHRQPKPKQTVRQIQAVRISHIDRTQGSQELMLHS 
10 20 30 40 50 60 

70 80 90 100 110 120 

LGLIGTPYKWGGSSTATGFDCSGMIQFVYKNALNVKLPRTARDMAAASRKIPDSRXKAGD 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 n' 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 { 1 1 1 1 1 1 1 1 

LGLIGTPYKWGGSSTATGFDCSGMIQFVYKNALNVKLPRTARDMAAASRKIPDSRLKAGD 
70 80 90 100 110 120 

130 140 150 160 170 

LVFFNTGGAHRYSHVGLYIGNGEFIHAPSSGKTIKTEKLSTPFYAKNYLGAHTFFTEX 

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

LVFFNTGGAHRYSHVGLYIGNGEFIHAPSSGKTIKTEKLSTPFYAKNYLGAHTFFTEX 
130 140 150 160 170 



287 



The following partial DNA sequence was identified in N. meningitidis <SEQ ID 1028>:

m287.seq 

1 ATGTTTAAAC GCAGCGTAAT CGCAATGGCT TGTATTTTTG CCCTTTCAGC 

51 CTGCGGGGGC GGCGGTGGCG GATCGCCCGA TGTCAAGTCG GCGGACACGC 

101 TGTCAAAACC TGCCGCCCCT GTTGTTTCTG AAAAAGAGAC AGAGGCAAAG 

151 GAAGATGCGC CACAGGCAGG TTCTCAAGGA CAGGGCGCGC CATCCGCACA 

201 AGGCAGTCAA GATATGGCGG CGGTTTCGGA AGAAAATACA GGCAATGGCG 

251 GTGCGGTAAC AGCGGATAAT CCCAAAAATG AAGACGAGGT GGCACAAAAT 

301 GATATGCCGC AAAATGCCGC CGGTACAGAT AGTTCGACAC CGAATCACAC 

351 CCCGGATCCG AATATGCTTG CCGGAAATAT GGAAAATCAA GCAACGGATG 

401 CCGGGGAATC GTCTCAGCCG GCAAACCAAC CGGATATGGC AAATGCGGCG 

451 GACGGAATGC AGGGGGACGA TCCGTCGGCA GGCGGGCAAA ATGCCGGCAA 

501 TACGGCTGCC CAAGGTGCAA ATCAAGCCGG AAACAATCAA GCCGCCGGTT 

551 CTTCAGATCC CATCCCCGCG TCAAACCCTG CACCTGCGAA TGGCGGTAGC 

601 AATTTTGGAA GGGTTGATTT GGCTAATGGC GTTTTGATTG ACGGGCCGTC 

651 GCAAAATATA ACGTTGACCC ACTGTAAAGG CGATTCTTGT AGTGGCAATA 

701 ATTTCTTGGA TGAAGAAGTA CAGCTAAAAT CAGAATTTGA AAAATTAAGT

751 GATGCAGACA AAATAAGTAA TTACAAGAAA GATGGGAAGA ATGATAAATT

801 TGTCGGTTTG GTTGCCGATA GTGTGCAGAT GAAGGGAATC AATCAATATA 

851 TTATCTTTTA TAAACCTAAA CCCACTTCAT TTGCGCGATT TAGGCGTTCT 

901 GCACGGTCGA GGCGGTCGCT TCCGGCCGAG ATGCCGCTGA TTCCCGTCAA 

951 TCAGGCGGAT ACGCTGATTG TCGATGGGGA AGCGGTCAGC CTGACGGGGC 

1001 ATTCCGGCAA TATCTTCGCG CCCGAAGGGA ATTACCGGTA TCTGACTTAC 

1051 GGGGCGGAAA AATTGCCCGG CGGATCGTAT GCCCTTCGTG TTCAAGGCGA 

1101 ACCGGCAAAA GGCGAAATGC TTGCGGGCGC GGCCGTGTAC AACGGCGAAG 

1151 TACTGCATTT CCATACGGAA AACGGCCGTC CGTACCCGAC CAGGGGCAGG 

1201 TTTGCCGCAA AAGTCGRTTT CGGCAGCAAA TCTGTGGACG GCATTATCGA 

1251 CAGCGGCGAT GATTTGCATA TGGGTACGCA AAAATTCAAA GCCGCCATCG

1301 ATGGAAACGG CTTTAAGGGG ACTTGGACGG AAAATGGCAG CGGGGATGTT 

1351 TCCGGAAAGT TTTACGGCCC GGCCGGCGAG GAAGTGGCGG GAAAATACAG 

1401 CTATCGCCCG ACAGATGCGG AAAAGGGCGG ATTCGGCGTG TTTGCCGGCA 



1451 AAAAAGAGCA GGATTGA

This corresponds to the amino acid sequence <SEQ ID 1029; ORF 287>: 

m287.pep

1 MFKRSVIAMA CIFALSACGG GGGGSPDVKS ADTLSKPAAP VVSEKETEAK

51 EDAPQAGSQG QGAPSAQGSQ DMAAVSEENT GNGGAVTADN PKNEDEVAQN 

101 DMPQNAAGTD SSTPNHTPDP NMLAGNMENQ ATDAGESSQP ANQPDMANAA 

151 DGMQGDDPSA GGQNAGNTAA QGANQAGNNQ AAGSSDPIPA SNPAPANGGS 

201 NFGRVDLANG VLIDGPSQNI TLTHCKGDSC SGNNFLDEEV QLKSEFEKLS

251 DADKISNYKK DGKNDKFVGL VADSVQMKGI NQYIIFYKPK PTSFARFRRS

301 ARSRRSLPAE MPLIPVNQAD TLIVDGEAVS LTGHSGNIFA PEGNYRYLTY 

351 GAEKLPGGSY ALRVQGEPAK GEMLAGAAVY NGEVLHFHTE NGRPYPTRGR 

401 FAAKVDFGSK SVDGIIDSGD DLHMGTQKFK AAIDGNGFKG TWTENGSGDV 

451 SGKFYGPAGE EVAGKYSYRP TDAEKGGFGV FAGKKEQD*

The following partial DNA sequence was identified in N. gonorrhoeae <SEQ ID 1030>:

g287.seq 

1 atgtttaaac gcagtgtgat tgcaatggct tg-atttttc ccctttcagc 

51 ctgtgggggc ggcggtggcg gatcgcccga tgtcaagtcg gcggacacgc 

101 cgtcaaaacc ggccgccccc gttgttgctg aaaatgccgg ggaaggggtg 

151 ctgccgaaag aaaagaaaga tgaggaggca gcgggcggtg cgccgcaagc 

201 cgatacgcag gacgcaaccg ccggagaagg cagccaagat atggcggcag 

251 tttcggcaga aaatacaggc aatggcggtg cggcaacaac ggacaacccc 

301 aaaaatgaag acgcgggggc gcaaaatgat atgccgcaaa atgccgccga 

351 atccgcaaat caaacaggga acaaccaacc cgccggttct tcagattccg 

401 cccccgcgtc aaaccctgcc cctgcgaatg gcggtagcga ttttggaagg

451 acgaacgtgg gcaattctgt tgtgattgac ggaccgtcgc aaaatataac 

501 gttgacccac tgtaaaggcg attcttgtaa tggtgataat ttattggatg 

551 aagaagcacc gtcaaaatca gaatttgaaa aattaagtga tgaagaaaaa 

601 attaagcgat ataaaaaaga cgagcaacgg gagaattttg tcggtttggt 

651 tgctgacagg gtaaaaaagg atggaactaa caaatatatc atcttctata 

701 cggacaaacc acctactcgt tctgcacggt cgaggaggtc gcttccggcc

751 gagattccgc tgattcccgt caatcaggcc gatacgctga ttgtggatgg

801 ggaagcggtc agcctgacgg ggcattccgg caatatcttc gcgcccgaag 

851 ggaattaccg gtatctgact tacggggcgg aaaaattgcc cggcggatcg 

901 tatgccctcc gtgtgcaagg cgaaccggca aaaggcgaaa tgcttgttgg 

951 cacggccgtg tacaacggcg aagtgctgca tttccatatg gaaaacggcc 

1001 gtccgtaccc gtccggaggc aggtttgccg caaaagtcga tttcggcagc 

1051 aaatctgtgg acggcattat cgacagcggc gatgatttgc atatgggtac 

1101 gcaaaaattc aaagccgcca tcgatggaaa cggctttaag gggacttgga 

1151 cggaaaatgg cggcggggat gtttccggaa ggttttacgg cccggccggc 

1201 gaggaagtgg cgggaaaata cagctatcgc ccgacagatg ctgaaaaggg 

1251 cggattcggc gtgtttgccg gcaaaaaaga tcgggattga 

This corresponds to the amino acid sequence <SEQ ID 1031; ORF 287.ng>: 

g287.pep 

1 MFKRSVIAMA CIFPLSACGG GGGGSPDVKS ADTPSKPAAP VVAENAGEGV

51 LPKEKKDEEA AGGAPQADTQ DATAGEGSQD MAAVSAENTG NGGAATTDNP

101 KNEDAGAQND MPQNAAESAN QTGNNQPAGS SDSAPASNPA PANGGSDFGR 

151 TNVGNSVVID GPSQNITLTH CKGDSCNGDN LLDEEAPSKS EFEKLSDEEK

201 IKRYKKDEQR ENFVGLVADR VKKDGTNKYI IFYTDKPPTR SARSRRSLPA 

251 EIPLIPVNQA DTLIVDGEAV SLTGHSGNIF APEGNYRYLT YGAEKLPGGS 

301 YALRVQGEPA KGEMLVGTAV YNGEVLHFHM ENGRPYPSGG RFAAKVDFGS

351 KSVDGIIDSG DDLHMGTQKF KAAIDGNGFK GTWTENGGGD VSGRFYGPAG 

401 EEVAGKYSYR PTDAEKGGFG VFAGKKDRD* 



m287/g287 ORFs 287 and 287.ng showed a 70.1% identity in 499 aa overlap 
MFKRSVIAMACIFALSACGGGGGGSPDVKSADTLSKPAAPVVSE KETEA

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I : I I : II 

MFKRSVIAMACIFPLSACGGGGGGSPDVKSADTPSKPAAPVVAENAGEGVLPKEKKDEEA 
KEDAPQAGSQGQGAPSAQGSQDMAAVSEENTGNGGAVTADNPKNEDEVAQNDMPQNAAGT

1111:1 I :::|llllllll I I I I I I I I : I : I I II I II illlllllll 
AGGAPQADTQD--ATAGEGSQDMAAVSAENTGNGGAATTDNPKNEDAGAQNDMPQNAA--
70 80 90 100 110 

110 120 130 140 150 160 169 

DSSTPNHTPDPNMLAGNMENQATDAGESSQPANQPDMANAADGMQGDDPSAGGQNAGNTA 



170 180 190 200 210 220 229 

AQGANQAGNNQAAGSSDPIPASNPAPANGGSNFGRVDLANGVLIDGPSQNITLTHCKGDS 

-ESANQTGNNQPAGSSDSAPASNPAPANGGSDFGRTNVGNSVVIDGPSQNITLTHCKGDS
120 130 140 150 160 170 

230 240 250 260 270 280 289 

CSGNNFLDEEVQLKSEFEKLSDADKISNYKKDGKNDKFVGLVADSVQMKGINQYIIFYKP



350 360 370 380 390 400 409 

YGAEKLPGGSYALRVQGEPAKGEMLAGAAVYNGEVLHFHTENGRPYPTRGRFAAKVDFGS 
llillllllllllllllllllllll:|:||||||||||| lllllll: lllllllllll 
YGAEKLPGGSYALRVQGEPAKGEMLVGTAVYNGEVLHFHMENGRPYPSGGRFAAKVDFGS 
300 310 320 330 340 350 

410 420 430 440 450 460 469

KSVDGIIDSGDDLHMGTQKFKAAIDGNGFKGTWTENGSGDVSGKFYGPAGEEVAGKYSYR 



m287.pep
g287 



PTDAEKGGFGVFAGKKEQDX 



The following partial DNA sequence was identified in N. meningitidis <SEQ ID 1032>: 

a287.seq

1 ATGTTTAAAC GCAGTGTGAT TGCAATGGCT TGTATTGTTG CCCTTTCAGC 

51 CTGTGGGGGC GGCGGTGGCG GATCGCCCGA TGTTAAGTCG GCGGACACGC 

101 TGTCAAAACC TGCCGCCCCT GTTGTTACTG AAGATGTCGG GGAAGAGGTG 

151 CTGCCGAAAG AAAAGAAAGA TGAGGAGGCG GTGAGTGGTG CGCCGCAAGC 

201 CGATACGCAG GACGCAACCG CCGGAAAAGG CGGTCAAGAT ATGGCGGCAG 

251 TTTCGGCAGA AAATACAGGC AATGGCGGTG CGGCAACAAC GGATAATCCC 






301 GAAAATAAAG ACGAGGGACC GCAAAATGAT ATGCCGCAAA ATGCCGCCGA 

351 TACAGATAGT TCGACACCGA ATCACACCCC TGCACCGAAT ATGCCAACCA

401 GAGATATGGG AAACCAAGCA CCGGATGCCG GGGAATCGGC ACAACCGGCA

451 AACCAACCGG ATATGGCAAA TGCGGCGGAC GGAATGCAGG GGGACGATCC 

501 GTCGGCAGGG GAAAATGCCG GCAATACGGC AGATCAAGCT GCAAATCAAG

551 CTGAAAACAA TCAAGTCGGC GGCTCTCAAA ATCCTGCCTC TTCAACCAAT

601 CCTAACGCCA CGAATGGCGG CAGCGATTTT GGAAGGATAA ATGTAGCTAA 

651 TGGCATCAAG CTTGACAGCG GTTCGGAAAA TGTAACGTTG ACACATTGTA 

701 AAGACAAAGT ATGCGATAGA GATTTCTTAG ATGAAGAAGC ACCACCAAAA 

751 TCAGAATTTG AAAAATTAAG TGATGAAGAA AAAATTAATA AATATAAAAA 

801 AGACGAGCAA CGAGAGAATT TTGTCGGTTT GGTTGCTGAC AGGGTAGAAA 

851 AGAATGGAAC TAACAAATAT GTCATCATTT ATAAAGACAA GTCCGCTTCA 

901 TCTTCATCTG CGCGATTCAG GCGTTCTGCA CGGTCGAGGC GGTCGCTTCC 

951 GGCCGAGATG CCGCTGATTC CCGTCAATCA GGCGGATACG CTGATTGTCG 

1001 ATGGGGAAGC GGTCAGCCTG ACGGGGCATT CCGGCAATAT CTTCGCGCCC 

1051 GAAGGGAATT ACCGGTATCT GACTTACGGG GCGGAAAAAT TGTCCGGCGG 

1101 ATCGTATGCC CTCAGTGTGC AAGGCGAACC GGCAAAAGGC GAAATGCTTG 

1151 CGGGCACGGC CGTGTACAAC GGCGAAGTGC TGCATTTCCA TATGGAAARC 

1201 GGCCGTCCGT CCCCGTCCGG AGGCAGGTTT GCCGCAAAAG TCGATTTCGG 

1251 CAGCAAATCT GTGGACGGCA TTATCGACAG CGGCGATGAT TTGCATATGG 

1301 GTACGCAAAA ATTCAAAGCC GTTATCGATG GAAACGGCTT TAAGGGGACT 

1351 TGGACGGAAA ATGGCGGCGG GGATGTTTCC GGAAGGTTTT ACGGCCCGGC 

1401 CGGCGAAGAA GTGGCGGGAA AATACAGCTA TCGCCCGACA GATGCGGAAA 

1451 AGGGCGGATT CGGCGTGTTT GCCGGCAAAA AAGAGCAGGA TTGA 



This corresponds to the amino acid sequence <SEQ ID 1033; ORF 287.a>:

a287.pep

1 MFKRSVIAMA CIVALSACGG GGGGSPDVKS ADTLSKPAAP VVTEDVGEEV

51 LPKEKKDEEA VSGAPQADTQ DATAGKGGQD MAAVSAENTG NGGAATTDNP

101 ENKDEGPQND MPQNAADTDS STPNHTPAPN MPTRDMGNQA PDAGESAQPA

151 NQPDMANAAD GMQGDDPSAG ENAGNTADQA ANQAENNQVG GSQNPASSTN

201 PNATNGGSDF GRINVANGIK LDSGSENVTL THCKDKVCDR DFLDEEAPPK

251 SEFEKLSDEE KINKYKKDEQ RENFVGLVAD RVEKNGTNKY VIIYKDKSAS

301 SSSARFRRSA RSRRSLPAEM PLIPVNQADT LIVDGEAVSL TGHSGNIFAP

351 EGNYRYLTYG AEKLSGGSYA LSVQGEPAKG EMLAGTAVYN GEVLHFHMEN

401 GRPSPSGGRF AAKVDFGSKS VDGIIDSGDD LHMGTQKFKA VIDGNGFKGT

451 WTENGGGDVS GRFYGPAGEE VAGKYSYRPT DAEKGGFGVF AGKKEQD*



ORFs 287 and 287.a showed a 77.2% identity in 501 aa overlap



MFKRSVIAMACIFALSACGGGGGGSPDVKSADTLSKPAAPVVSE KETEA

llllllllllll lllllllllllllllllllllllllllll:! I: II 

MFKRSVIAMACIVALSACGGGGGGSPDVKSADTLSKPAAPVVTEDVGEEVLPKEKKDEEA



10 20 30 40 50 60



60 70 80 90 100 109 

KEDAPQAGSQGQGAPSAQGSQDMAAVSEENTGNGGAVTADNPKNEDEVAQNDMPQNAAGT 
I I I I : I I : :: I : I I I I I I I I I II I I I I : I : I I I : I : I I I I I I I II II I 
VSGAPQADTQ--DATAGKGGQDMAAVSAENTGNGGAATTDNPENKDEGPQNDMPQNAADT



70 80 90 100 110



110 120 130 140 150 160 169 

m287.pep DSSTPNHTPDPNMLAGNMENQATDAGESSQPANQPDMANAADGMQGDDPSAGGQNAGNTA

a287 DSSTPNHTPAPNMPTRDMGNQAPDAGESAQPANQPDMANAADGMQGDDPSAG-ENAGNTA 
120 130 140 150 160 170 



170 180 190 200 210 220 229 

m287.pep AQGANQAGNNQAAGSSDPIPASNPAPANGGSNFGRVDLANGVLIDGPSQNITLTHCKGDS

a287 DQAANQAENNQVGGSQNPASSTNPNATNGGSDFGRINVANGIKLDSGSENVTLTHCKDKV



290 300 310 320 330 340

KP--TSFARFRRSARSRRSLPAEMPLIPVNQADTLIVDGEAVSLTGHSGNIFAPEGNYRY

KSASSSSARFRRSARSRRSLPAEMPLIPVNQADTLIVDGEAVSLTGHSGNIFAPEGNYRY






LTYGAEKLPGGSYALRVQGEPAKGEMLAGAAVYNGEVLHFHTENGRPYPTRGRFAAKVDF 
llllllll llllll I I I I I I I I I I I I I: I I I I I I I I I I I Hill I: lllllllll 
LTYGAEKLSGGSYALSVQGEPAKGEMLAGTAVYNGEVLHFHMENGRPSPSGGRFAAKVDF 
360 370 380 390 400 410 

410 420 430 440 450 460 

GSKSVDGIIDSGDDLHMGTQKFKAAIDGNGFKGTWTENGSGDVSGKFYGPAGEEVAGKYS 

I I 111111:11 Illllllllllll 

GSKSVDGIIDSGDDLHMGTQKFKAVIDGNGFKGTWTENGGGDVSGRFYGPAGEEVAGKYS 
420 430 440 450 460 470 



470 



489 



YRPTDAEKGGFGVFAGKKEQDX 
||||||||||||||||||||||
YRPTDAEKGGFGVFAGKKEQDX 



406 



The following partial DNA sequence was identified in N. meningitidis <SEQ ID 1034>:

m406.seq 

1 ATGCAAGCAC GGCTGCTGAT ACCTATTCTT TTTTCAGTTT TTATTTTATC 

51 CGCCTGCGGG ACACTGACAG GTATTCCATC GCATGGCGGA GGTAAACGCT 

101 TTGCGGTCGA ACAAGAACTT GTGGCCGCTT CTGCCAGAGC TGCCGTTAAA 

151 GACATGGATT TACAGGCATT ACACGGACGA AAAGTTGCAT TGTACATTGC 

201 CAGTATGGGC GACCAAGGTT CAGGCAGTTT GACAGGGGGT CGCTACTCCA 

251 TTGATGCACT GATTCGTGGC GAATACATAA ACAGCCCTGC CGTCCGTACC 

301 GATTACACCT ATCCACGTTA CGAAACCACC GCTGAAACAA CATCAGGCGG

351 TTTGACAGGT TTAACCACTT CTTTATCTAC ACTTAATGCC CCTGCACTCT 

401 CTCGCACCCA ATCAGACGGT AGCGGAAGTA AAAGCAGTCT GGGCTTAAAT

451 ATTGGCGGGA TGGGGGATTA TCGAAATGAA ACCTTGACGA CTAACCCGCG 

501 CGACACTGCC TTTCTTTCCC ACTTGGTACA GACCGTATTT TTCCTGCGCG 

551 GCATAGACGT TGTTTCTCCT GCCAATGCCG ATACAGATGT GTTTATTAAC

601 ATCGACGTAT TCGGAACGAT ACGCAACAGA ACCGAAATGC ACCTATACAA 

651 TGCCGAAACA CTGAAAGCCC AAACAAAACT GGAATATTTC GCAGTAGACA 

701 GAACCAATAA AAAATTGCTC ATCAAACCTiA AAACCAATGC GTTTGAAGCT 

751 GCCTATAAAG AAAATTACGC ATTGTGGATG GGGCCGTATA AAGTAAGCAA 

801 AGGAATTAAA CCGACGGAAG GATTAATGGT CGATTTCTCC GATATCCGAC 

851 CATACGGCAA TCATACGGGT AACTCCGCCC CATCCGTAGA GGCTGATAAC 

901 AGTCATGAGG GGTATGGATA CAGCGATGAA GTAGTGCGAC AACATAGACA 

951 AGGACAACCT TGA 
This corresponds to the amino acid sequence <SEQ ID 1035; ORF 406>:

m406.pep 

1 MQARLLIPIL FSVFILSACG TLTGIPSHGG GKRFAVEQEL VAASARAAVK

51 DMDLQALHGR KVALYIATMG DQGSGSLTGG RYSIDALIRG EYINSPAVRT 

101 DYTYPRYETT AETTSGGLTG LTTSLSTLNA PALSRTQSDG SGSKSSLGLN 

151 IGGMGDYRNE TLTTNPRDTA FLSHLVQTVF FLRGIDVVSP ANADTDVFIN

201 IDVFGTIRNR TEMHLYNAET LKAQTKLEYF AVDRTNKKLL IKPKTNAFEA 

251 AYKENYALWM GPYKVSKGIK PTEGLMVDFS DIRPYGNHTG NSAPSVEADN 

301 SHEGYGYSDE VVRQHRQGQP *

The following partial DNA sequence was identified in N. gonorrhoeae <SEQ ID 1036>: 

g406.seq

1 ATGCGGGCAC GGCTGCTGAT ACCTATTCTT TTTTCAGTTT TTATTTTATC 

51 CGCCTGCGGG ACACTGACAG GTATTCCATC GCATGGCGGA GGCAAACGCT 

101 TCGCGGTCGA ACAAGAACTT GTGGCCGCTT CTGCCAGAGC TGCCGTTAAA 

151 GACATGGATT TACAGGCATT ACACGGACGA AAAGTTGCAT TGTACATTGC 

201 AACTATGGGC GACCAAGGTT CAGGCAGTTT GACAGGGGGT CGCTACTCCA 

251 TTGATGCACT GATTCGCGGC GAATACATAA ACAGCCCTGC CGTCCGCACC 

301 GATTACACCT ATCCGCGTTA CGAAACCACC GCTGAAACAA CATCAGGCGG

351 TTTGACGGGT TTAACCACTT CTTTATCTAC ACTTAATGCC CCTGCACTCT 

401 CGCGCACCCA ATCAGACGGT AGCGGAAGTA GGAGCAGTCT GGGCTTAAAT 

451 ATTGGCGGGA TGGGGGATTA TCGAAATGAA ACCTTGACGA CCAACCCGCG 

501 CGACACTGCC TTTCTTTCCC ACTTGGTGCA GACCGTATTT TTCCTGCGCG 

551 GCATAGACGT TGTTTCTCCT GCCAATGCCG ATACAGATGT GTTTATTAAC 

601 ATCGACGTAT TCGGAACGAT ACGCAACAGA ACCGAAATGC ACCTATACAA

651 TGCCGAAACA CTGAAAGCCC AAACAAAACT GGAATATTTC GCAGTAGACA 

701 GAACCAATAA AAAATTGCTC ATCAAACCCA AAACCAATGC GTTTGAAGCT 

751 GCCTATAAAG AAAATTACGC ATTGTGGATG GGGCCGTATA AAGTAAGCAA 

801 AGGAATCAAA CCGACGGAAG GATTGATGGT CGATTTCTCC GATATCCAAC 

851 CATACGGCAA TCATACGGGT AACTCCGCCC CATCCGTAGA GGCTGATAAC

901 AGTCATGAGG GGTATGGATA CAGCGATGAA GCAGTGCGAC AACATAGACA

951 AGGGCAACCT TGA 

This corresponds to the amino acid sequence <SEQ ID 1037; ORF 406.ng>:

g406 .pep 

1 MRARLLIPIL FSVFILSACG TLTGIPSHGG GKRFAVEQEL VAASARAAVK

51 DMDLQALHGR KVALYIATMG DQGSGSLTGG RYSIDALIRG EYINSPAVRT 

101 DYTYPRYETT AETTSGGLTG LTTSLSTLNA PALSRTQSDG SGSRSSLGLN 

151 IGGMGDYRNE TLTTNPRDTA FLSHLVQTVF FLRGIDVVSP ANADTDVFIN

201 IDVFGTIRNR TEMHLYNAET LKAQTKLEYF AVDRTNKKLL IKPKTNAFEA 

251 AYKENYALWM GPYKVSKGIK PTEGLMVDFS DIQPYGNHTG NSAPSVEADN 

301 SHEGYGYSDE AVRQHRQGQP * 



ORF 406 shows 98.8% identity over a 320 aa overlap with a predicted ORF (ORF 406.ng)
from N. gonorrhoeae:

g406/m406 

10 20 30 40 50 60 

g406 .pep MRARLLIPILFSVFILSACGTLTGIPSHGGGKRFAVEQELVAASARAAVKDMDLQALHGR 

hllllllllllMIIIIIIIIIIIMIIIMIIIIMIIIIIIIIIIMMIIIIIMI 

m406 MQARLLIPILFSVFILSACGTLTGIPSHGGGKRFAVEQELVAASARAAVKDMDLQALHGR

10 20 30 40 50 60 



g406 .pep 



70 80 90 100 110 120 

KVALYIATMGDQGSGSLTGGRYSIDALIRGEYINSPAVRTDYTYPRYETTAETTSGGLTG 
IIIIIIIIIIIIIIIIIIIIIIMIIIIIIIIIIIIIIIIMIIIIIIIIIIIIIIIIII 
130 140 150 160 170 180

LTTSLSTLNAPALSRTQSDGSGSRSSLGLNIGGMGDYRNETLTTNPRDTAFLSHLVQTVF 
IIIMIIIIIIIIIIIIIIIIIhllllllllllllllllllllllllllllllllllll 
LTTSLSTLNAPALSRTQSDGSGSKSSLGLNIGGMGDYRNETLTTNPRDTAFLSHLVQTVF 

130 140 150 160 170 180 

190 200 210 220 230 240 

FLRGIDVVSPANADTDVFINIDVFGTIRNRTEMHLYNAETLKAQTKLEYFAVDRTNKKLL
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
FLRGIDVVSPANADTDVFINIDVFGTIRNRTEMHLYNAETLKAQTKLEYFAVDRTNKKLL

190 200 210 220 230 240

250 260 270 280 290 300 

IKPKTNAFEAAYKENYALWMGPYKVSKGIKPTEGLMVDFSDIQPYGNHTGNSAPSVEADN 

IMIIIMIIIIIMIIIMilillllllllllllMllllhlllllllllllllllll 

IKPKTNAFEAAYKENYALWMGPYKVSKGIKPTEGLMVDFSDIRPYGNHTGNSAPSVEADN 
250 260 270 280 290 300 

310 320 
SHEGYGYSDEAVRQHRQGQPX 

Illllllllhllllllllll 

SHEGYGYSDEVVRQHRQGQPX
310 320 



The following partial DNA sequence was identified in N. meningitidis <SEQ ID 1038>:

a406.seq

1 ATGCAAGCAC GGCTGCTGAT ACCTATTCTT TTTTCAGTTT TTATTTTATC

51 CGCCTGCGGG ACACTGACAG GTATTCCATC GCATGGCGGA GGTAAACGCT

101 TCGCGGTCGA ACAAGAACTT GTGGCCGCTT CTGCCAGAGC TGCCGTTAAA

151 GACATGGATT TACAGGCATT ACACGGACGA AAAGTTGCAT TGTACATTGC

201 AACTATGGGC GACCAAGGTT CAGGCAGTTT GACAGGGGGT CGCTACTCCA

251 TTGATGCACT GATTCGTGGC GAATACATAA ACAGCCCTGC CGTCCGTACC

301 GATTACACCT ATCCACGTTA CGAAACCACC GCTGAAACAA CATCAGGCGG

351 TTTGACAGGT TTAACCACTT CTTTATCTAC ACTTAATGCC CCTGCACTCT

401 CGCGCACCCA ATCAGACGGT AGCGGAAGTA AAAGCAGTCT GGGCTTAAAT

451 ATTGGCGGGA TGGGGGATTA TCGAAATGAA ACCTTGACGA CTAACCCGCG

501 CGACACTGCC TTTCTTTCCC ACTTGGTACA GACCGTATTT TTCCTGCGCG

551 GCATAGACGT TGTTTCTCCT GCCAATGCCG ATACGGATGT GTTTATTAAC

601 ATCGACGTAT TCGGAACGAT ACGCAACAGA ACCGAAATGC ACCTATACAA

651 TGCCGAAACA CTGAAAGCCC AAACAAAACT GGAATATTTC GCAGTAGACA

701 GAACCAATAA AAAATTGCTC ATCAAACCAA AAACCAATGC GTTTGAAGCT

751 GCCTATAAAG AAAATTACGC ATTGTGGATG GGACCGTATA AAGTAAGCAA

801 AGGAATTAAA CCGACAGAAG GATTAATGGT CGATTTCTCC GATATCCAAC

851 CATACGGCAA TCATATGGGT AACTCTGCCC .......... GGCTGATAAC

901 AGTCATGAGG GGTATGGATA CAGCGATGAA GCAGTGCGAC GACATAGACA

951 AGGGCAACCT TGA

This corresponds to the amino acid sequence <SEQ ID 1039; ORF 406.a>:

a406.pep

1 MQARLLIPIL FSVFILSACG TLTGIPSHGG GKRFAVEQEL VAASARAAVK

51 DMDLQALHGR KVALYIATMG DQGSGSLTGG RYSIDALIRG EYINSPAVRT

101 DYTYPRYETT AETTSGGLTG LTTSLSTLNA PALSRTQSDG SGSKSSLGLN

151 IGGMGDYRNE TLTTNPRDTA FLSHLVQTVF FLRGIDVVSP ANADTDVFIN

201 IDVFGTIRNR TEMHLYNAET LKAQTKLEYF AVDRTNKKLL IKPKTNAFEA

251 AYKENYALWM GPYKVSKGIK PTEGLMVDFS DIQPYGNHMG NSAPSVEADN

301 SHEGYGYSDE AVRRHRQGQP *
m406/a406 ORFs 406 and 406.a showed a 98.8% identity in 320 aa overlap

10 20 30 40 50 60 

m406.pep MQARLLIPILFSVFILSACGTLTGIPSHGGGKRFAVEQELVAASARAAVKDMDLQALHGR 
I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I i I I I I I I I I I I I I I I 
a406 MQARLLIPILFSVFILSACGTLTGIPSHGGGKRFAVEQELVAASARAAVKDMDLQALHGR 

10 20 30 40 50 60 

70 80 90 100 110 120 

m406.pep KVALYIATMGDQGSGSLTGGRYSIDALIRGEYINSPAVRTDYTYPRYETTAETTSGGLTG 

I I I I I I I I I I I I I I I I I I I I I I Illlll I Illll 

a406 KVALYIATMGDQGSGSLTGGRYSIDALIRGEYINSPAVRTDYTYPRYETTAETTSGGLTG
70 80 90 100 110 120 

130 140 150 160 170 180

m406.pep LTTSLSTLNAPALSRTQSDGSGSKSSLGLNIGGMGDYRNETLTTNPRDTAFLSHLVQTVF

Ill Illlllllllllllllllllll nil 

a406 LTTSLSTLNAPALSRTQSDGSGSKSSLGLNIGGMGDYRNETLTTNPRDTAFLSHLVQTVF 

130 140 150 160 170 180 

190 200 210 220 230 240 

m406.pep FLRGIDVVSPANADTDVFINIDVFGTIRNRTEMHLYNAETLKAQTKLEYFAVDRTNKKLL

I II I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
a406 FLRGIDVVSPANADTDVFINIDVFGTIRNRTEMHLYNAETLKAQTKLEYFAVDRTNKKLL

190 200 210 220 230 240 

250 260 270 280 290 300 

m406.pep IKPKTNAFEAAYKENYALWMGPYKVSKGIKPTEGLMVDFSDIRPYGNHTGNSAPSVEADN

Hill Illll illllll 111111111:11111 lllllllllll 

a406 IKPKTNAFEAAYKENYALWMGPYKVSKGIKPTEGLMVDFSDIQPYGNHMGNSAPSVEADN 

250 260 270 280 290 300 

310 320 
m406.pep SHEGYGYSDEVVRQHRQGQPX 

a406 SHEGYGYSDEAVRRHRQGQPX 
310 320 



The following partial DNA sequence was identified in N. meningitidis <SEQ ID 1040>: 

m726.seq

1 ATGACCATCT ATTTCAAAAA CGGCTTTTAC GACGACACAT TGGGCGGCAT 

51 CCCCGAAGGC GCGGTTGCCG TCCGCGCCGA AGAATACGCC GCCCTTTTGG 

101 CAGGACAGGC GCAGGGCGGG CAGATTGCCG CAGATTCCGA CGGCCGCCCC 

151 GTTTTAACCC CGCCGCGCCC GTCCGATTAC CACGAATGGG ACGGCAAAAA 

201 ATGGAAAATC AGCAAAGCCG CCGCCGCCGC CCGTTTCGCC AAACAAAAAA 

251 CCGCCTTGGC ATTCCGCCTC GCGGAAAAGG CGGACGAACT CAAAAACAGC 

301 CTCTTGGCGG GCTATCCCCA AGTGGAAATC GACAGCTTTT ACAGGCAGGA 

351 AAAAGAAGCC CTCGCGCGGC AGGCGGACAA CAACGCCCCG ACCCCGATGC 

401 TGGCGCAAAT CGCCGCCGCA AGGGGCGTGG AATTGGACGT TTTGATTGAA

451 AAAGTTATCG AAAAATCCGC CCGCCTGGCT GTTGCCGCCG GCGCGATTAT

501 CGGAAAGCGT CAGCAGCTCG AAGACAAATT GAACACCATC GAAACCGCGC 

551 CCGGATTGGA CGCGCTGGAA AAGGAAATCG AAGAATGGAC GCTAAACATC 

601 GGCTGA 

This corresponds to the amino acid sequence <SEQ ID 1041; ORF 726>: 

m726.pep 

1 MTIYFKNGFY DDTLGGIPEG AVAVRAEEYA ALLAGQAQGG QIAADSDGRP 

51 VLTPPRPSDY HEWDGKKWKI SKAAAAARFA KQKTALAFRL AEKADELKNS 
101 LLAGYPQVEI DSFYRQEKEA LARQADNNAP TPMLAQIAAA RGVELDVLIE
151 KVIEKSARLA VAAGAIIGKR QQLEDKLNTI ETAPGLDALE KEIEEWTLNI 
201 G* 
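
The relationship between a numbered DNA listing and the amino acid sequence that "corresponds to" it can be illustrated with a short Python sketch. It joins the first two rows of the m726 listing into one string and translates them in the first reading frame. The standard genetic code is assumed, and parse_listing and translate are hypothetical helper names introduced here, not tools named in this specification.

```python
# Standard genetic code with bases ordered T, C, A, G (an assumption).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def parse_listing(rows):
    """Join numbered rows such as '51 CCCC... ...' into one sequence."""
    sequence = ""
    for row in rows:
        fields = row.split()
        offset, groups = int(fields[0]), fields[1:]
        # The leading number is the 1-based position of the row's first base.
        if offset != len(sequence) + 1:
            raise ValueError(f"row {offset} does not follow position {len(sequence)}")
        sequence += "".join(groups)
    return sequence.upper()


def translate(dna):
    """Translate frame 1, stopping after the first stop codon ('*')."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        protein.append(residue)
        if residue == "*":
            break
    return "".join(protein)


rows = [
    "1 ATGACCATCT ATTTCAAAAA CGGCTTTTAC GACGACACAT TGGGCGGCAT",
    "51 CCCCGAAGGC GCGGTTGCCG TCCGCGCCGA AGAATACGCC GCCCTTTTGG",
]
dna = parse_listing(rows)
print(translate(dna))  # MTIYFKNGFYDDTLGGIPEGAVAVRAEEYAALL
```

The 33 residues produced from these 100 bases match the start of the ORF 726 sequence listed above; the final base is left over because it does not complete a codon.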



The following partial DNA sequence was identified in N. meningitidis <SEQ ID 1042>: 

m907-2.seq 

1 ATGAGAAAAC CGACCGATAC CCTACCCGTT AATCTGCAAC GCCGCCGCCT 

51 GTTGTGTGCC GCCGGTGCGT TGTTGCTCAG TCCTCTGGCG CACGCCGGCG 

101 CGCAACGTGA GGAAACGCTT GCCGACGATG TGGCTTCCGT GATGAGGAGT 

151 TCTGTCGGCA GCGTCAATCC GCCGAGGCTG GTGTTTGACA ATCCGAAAGA 

201 GGGCGAGCGT TGGTTGTCTG CCATGTCGGC ACGTTTGGCA AGGTTCGTCC 

251 CCGAGGAGGA GGAGCGGCGC AGGCTGCTGG TCAATATCCA GTACGAAAGC 

301 AGCCGGGCCG GTTTGGATAC GCAGATTGTG TTGGGGCTGA TTGAGGTGGA 

351 AAGCGCGTTC CGCCAGTATG CAATCAGCGG TGTCGGCGCG CGCGGCCTGA 

401 TGCAGGTTAT GCCGTTTTGG AAAAACTACA TCGGCAAACC GGCGCACAAC

451 CTGTTCGACA TCCGCACCAA CCTGCGTTAC GGCTGTACCA TCCTGCGCCA

501 TTACCGGAAT CTTGAAAAAG GCAACATCGT CCGCGCGCTT GCCCGCTTTA 

551 ACGGCAGCTT GGGCAGCAAT AAATATCCGA ACGCCGTTTT GGGCGCGTGG 

601 CGCAACCGCT GGCAGTGGCG TTGA 

This corresponds to the amino acid sequence <SEQ ID 1043; ORF 907-2>: 

m907-2.pep 

1 MRKPTDTLPV NLQRRRLLCA AGALLLSPLA HAGAQREETL ADDVASVMRS 
51 SVGSVNPPRL VFDNPKEGER WLSAMSARLA RFVPEEEERR RLLVNIQYES 
101 SRAGLDTQIV LGLIEVESAF RQYAISGVGA RGLMQVMPFW KNYIGKPAHN 
151 LFDIRTNLRY GCTILRHYRN LEKGNIVRAL ARFNGSLGSN KYPNAVLGAW 
201 RNRWQWR* 



The following partial DNA sequence was identified in N. meningitidis <SEQ ID 1044>: 

m953.seq 

1 ATGAAAAAAA TCATCTTCGC CGCACTCGCA GCCGCCGCCA TCAGTACTGC 

51 CTCCGCCGCC ACCTACAAAG TGGACGAATA TCACGCCAAC GCCCGTTTCG 

101 CCATCGACCA TTTCAACACC AGCACCAACG TCGGCGGTTT TTACGGTCTG 

151 ACCGGTTCCG TCGAGTTCGA CCAAGCAAAA CGCGACGGTA AAATCGACAT 

201 CACCATCCCC ATTGCCAACC TGCAAAGCGG TTCGCAACAC TTTACCGACC 

251 ACCTGAAATC AGCCGACATC TTCGATGCCG CCCAATATCC GGACATCCGC 

301 TTTGTTTCCA CCAAATTCAA CTTCAACGGC AAAAAACTGG TTTCCGTTGA 

351 CGGCAACCTG ACCATGCACG GCAAAACCGC CCCCGTCAAA CTCAAAGCCG 

401 AAAAATTCAA CTGCTACCAA AGCCCGATGG AGAAAACCGA AGTTTGTGGC 

451 GGCGACTTCA GCACCACCAT CGACCGCACC AAATGGGGCA TGGACTACCT 

501 CGTTAACGTT GGTATGACCA AAAGCGTCCG CATCGACATC CAAATCGAGG 

551 CAGCCAAACA ATAA 



This corresponds to the amino acid sequence <SEQ ID 1045; ORF 953>: 

m953.pep 

1 MKKIIFAALA AAAISTASAA TYKVDEYHAN ARFAIDHFNT STNVGGFYGL 

51 TGSVEFDQAK RDGKIDITIP IANLQSGSQH FTDHLKSADI FDAAQYPDIR 

101 FVSTKFNFNG KKLVSVDGNL TMHGKTAPVK LKAEKFNCYQ SPMEKTEVCG 

151 GDFSTTIDRT KWGMDYLVNV GMTKSVRIDI QIEAAKQ* 



The following partial DNA sequence was identified in N. meningitidis <SEQ ID 1046>: 
1 ATGAAAACAA CCGACAAACG GACAACCGAA ACACACCGCA AAGCCCCGAA 

51 AACCGGCCGC ATCCGCTTCT CGCCTGCTTA CTTAGCCATA TGCCTGTCGT 

101 TCGGCATTCT TCCCCAAGCC TGGGCGGGAC ACACTTATTT CGGCATCAAC 

151 TACCAATACT ATCGCGACTT TGCCGAAAAT AAAGGCAAGT TTGCAGTCGG 

201 GGCGAAAGAT ATTGAGGTTT ACAACAAAAA AGGGGAGTTG GTCGGCAAAT 

251 CAATGACAAA AGCCCCGATG ATTGATTTTT CTGTGGTGTC GCGTAACGGC 

301 GTGGCGGCAT TGGTGGGCGA TCAATATATT GTGAGCGTGG CACATAACGG 

351 CGGCTATAAC AACGTTGATT TTGGTGCGGA AGGAAGAAAT CCCGATCAAC 

401 ATCGTTTTAC TTATAAAATT GTGAAACGGA ATAATTATAA AGCAGGGACT 

451 AAAGGCCATC CTTATGGCGG CGATTATGRT ATGCCGCGTT TGCATAAATT 

501 TGTCACAGAT GCAGAACCTG TTGAAATGAC CAGTTATATG GATGGGCGGA 

551 AATATATCGA TCAAAATAAT TACCCTGACC GTGTTCGTAT TGGGGCAGGC 

601 AGGCAATATT GGCGATCTGA TGAAGATGAG CCCAATAACC GCGAAAGTTC 

651 ATATCATATT GCAAGTGCGT ATTCTTGGCT CGTTGGTGGC AATACCTTTG 

701 CACAAAATGG ATCAGGTGGT GGCACAGTCA ACTTAGGTAG TGAAAAAATT 

751 AAACATAGCC CATATGGTTT TTTACCAACA GGAGGCTCAT TTGGCGACAG 

801 TGGCTCACCA ATGTTTATCT ATGATGCCCA AAAGCAAAAG TGGTTAATTA 

851 ATGGGGTATT GCAAACGGGC AACCCCTATA TAGGAAAAAG CAATGGCTTC 

901 CAGCTGGTTC GTAAAGATTG GTTCTATGAT GAAATCTTTG CTGGAGATAC 

951 CCATTCAGTA TTCTACGAAC CACGTCAAAA TGGGAAATAC TCTTTTAACG 

1001 ACGATAATAA TGGCACAGGA AAAATCAATG CCAAACATGA ACACAATTCT 

1051 CTGCCTAATA GATTAAAAAC ACGAACCGTT CAATTGTTTA ATGTTTCTTT 

1101 ATCCGAGACA GCAAGAGAAC CTGTTTATCA TGCTGCAGGT GGTGTCAACA 

1151 GTTATCGACC CAGACTGAAT AATGGAGAAA ATATTTCCTT TATTGACGAA 

1201 GGAAAAGGCG AATTGATACT TACCAGCAAC ATCAATCAAG GTGCTGGAGG 

1251 ATTATATTTC CAAGGAGATT TTACGGTCTC GCCTGAAAAT AACGAAACTT 

1301 GGCAAGGCGC GGGCGTTCAT ATCAGTGAAG ACAGTACCGT TACTTGGAAA 

1351 GTAAACGGCG TGGCAAACGA CCGCCTGTCC AAAATCGGCA AAGGCACGCT 

1401 GCACGTTCAA GCCAAAGGGG AAAACCAAGG CTCGATCAGC GTGGGCGACG 

1451 GTACAGTCAT TTTGGATCAG CAGGCAGACG ATAAAGGCAA AAAACAAGCC 

1501 TTTAGTGAAA TCGGCTTGGT CAGCGGCAGG GGTACGGTGC AACTGAATGC 

1551 CGATAATCAG TTCAACCCCG ACAAACTCTA TTTCGGCTTT CGCGGCGGAC 

1601 GTTTGGATTT AAACGGGCAT TCGCTTTCGT TCCACCGTAT TCAAAATACC 

1651 GATGAAGGGG CGATGATTGT CAACCACAAT CAAGACAAAG AATCCACCGT 

1701 TACCATTACA GGCAATAAAG ATATTGCTAC AACCGGCAAT AACAACAGCT 

1751 TGGATAGCAA AAAAGAAATT GCCTACAACG GTTGGTTTGG CGAGAAAGAT 

1801 ACGACCAAAA CGAACGGGCG GCTCAACCTT GTTTACCAGC CCGCCGCAGA 

1851 AGACCGCACC CTGCTGCTTT CCGGCGGAAC AAATTTAAAC GGCAACATCA 

1901 CGCAAACAAA CGGCAAACTG TTTTTCAGCG GCAGACCAAC ACCGCACGCC 

1951 TACAATCATT TAAACGACCA TTGGTCGCAA AAAGAGGGCA TTCCTCGCGG 

2001 GGAAATCGTG TGGGACAACG ACTGGATCAA CCGCACATTT AAAGCGGAAA 

2051 ACTTCCAAAT TAAAGGCGGA CAGGCGGTGG TTTCCCGCAA TGTTGCCAAA 

2101 GTGAAAGGCG ATTGGCATTT GAGCAATCAC GCCCAAGCAG TTTTTGGTGT 

2151 CGCACCGCAT CAAAGCCACA CAATCTGTAC ACGTTCGGAC TGGACGGGTC 

2201 TGACAAATTG TGTCGAAAAA ACCATTACCG ACGATAAAGT GATTGCTTCA 

2251 TTGACTAAGA CCGACATCAG CGGCAATGTC GATCTTGCCG ATCACGCTCA 

2301 TTTAAATCTC ACAGGGCTTG CCACACTCAA CGGCAATCTT AGTGCAAATG 

2351 GCGATACACG TTATACAGTC AGCCACAACG CCACCCAAAA CGGCAACCTT 

2401 AGCCTCGTGG GCAATGCCCA AGCAACATTT AATCAAGCCA CATTAAACGG 

2451 CAACACATCG GCTTCGGGCA ATGCTTCATT TAATCTAAGC GACCACGCCG 

2501 TACAAAACGG CAGTCTGACG CTTTCCGGCA ACGCTAAGGC AAACGTAAGC 

2551 CATTCCGCRC TCAACGGTAA TGTCTCCCTA GCCGATAAGG CAGTATTCCA 

2601 TTTTGAAAGC AGCCGCTTTA CCGGACAAAT CAGCGGCGGC AAGGATACGG 

2651 CATTACACTT AAAAGACAGC GAATGGACGC TGCCGTCAGG CACGGAATTA 

2701 GGCAATTTAA ACCTTGACAA CGCCACCATT ACACTCAATT CCGCCTATCG 

2751 CCACGATGCG GCAGGGGCGC AAACCGGCAG TGCGACAGAT GCGCCGCGCC 

2801 GCCGTTCGCG CCGTTCGCGC CGTTCCCTAT TATCCGTTAC ACCGCCAACT 

2851 TCGGTAGAAT CCCGTTTCAA CACGCTGACG GTAAACGGCA AATTGAACGG 

2901 TCAGGGAACA TTCCGCTTTA TGTCGGAACT CTTCGGCTAC CGCAGCGACA 

2951 AATTGAAGCT GGCGGAAAGT TCCGAAGGCA CTTACACCTT GGCGGTCAAC 

3001 AATACCGGCA ACGAACCTGC AAGCCTCGAA CAATTGACGG TAGTGGAAGG 

3051 AAAAGACAAC AAACCGCTGT CCGAAAACCT TAATTTCACC CTGCAAAACG 

3101 AACACGTCGA TGCCGGCGCG TGGCGTTACC AACTCATCCG CAAAGACGGC 
3151 GAGTTCCGCC TGCATAATCC GGTCAAAGAA CAAGAGCTTT CCGACAAACT 

3201 CGGCAAGGCA GAAGCCAAAA AACAGGCGGA AAAAGACAAC GCGCAAAGCC 

3251 TTGACGCGCT GATTGCGGCC GGGCGCGATG CCGTCGAAAA GACAGAAAGC 

3301 GTTGCCGAAC CGGCCCGGCA GGCAGGCGGG GAAAATGTCG GCATTATGCA 

3351 GGCGGAGGAA GAGAAAAAAC GGGTGCAGGC GGATAAAGAC ACCGCCTTGG 

3401 CGAAACAGCG CGAAGCGGAA ACCCGGCCGG CTACCACCGC CTTCCCCCGC 

3451 GCCCGCCGCG CCCGCCGGGA TTTGCCGCAA CTGCAACCCC AACCGCAGCC 

3501 CCAACCGCAG CGCGACCTGA TCAGCCGTTA TGCCAATAGC GGTTTGAGTG 

3551 AATTTTCCGC CACGCTCAAC AGCGTTTTCG CCGTACAGGA CGAATTAGAC 

3601 CGCGTATTTG CCGAAGACCG CCGCAACGCC GTTTGGACAA GCGGCATCCG 

3651 GGACACCAAR CACTACCGTT CGCAAGATTT CCGCGCCTAC CGCCAACAAA 

3701 CCGACCTGCG CCAAATCGGT ATGCAGAAAA ACCTCGGCAG CGGGCGCGTC 

3751 GGCATCCTGT TTTCGCACAA CCGGACCGAA AACACCTTCG ACGACGGCAT 

3801 CGGCAACTCG GCACGGCTTG CCCACGGCGC CGTTTTCGGG CAATACGGCA 

3851 TCGACAGGTT CTACATCGGC ATCAGCGCGG GCGCGGGTTT TAGCAGCGGC 

3901 AGCCTTTCAG ACGGCATCGG AGGCAAAATC CGCCGCCGCG TGCTGCATTA 

3951 CGGCATTCAG GCACGATACC GCGCGGGTTT CGGCGGATTC GGCATCGAAC 

4001 CGCACATCGG CGCAACGCGC TATTTCGTCC AAAAAGCGGA TTACCGCTAC 

4051 GAAAACGTCA ATATCGCCAC CCCCGGCCTT GCATTCAACC GCTACCGCGC 

4101 GGGCATTAAG GCAGATTATT CATTCAAACC GGCGCAACAC ATTTCCATCA 

4151 CGCCTTATTT GAGCCTGTCC TATACCGATG CCGCTTCGGG CAAAGTCCGA 

4201 ACACGCGTCA ATACCGCCGT ATTGGCTCAG GATTTCGGCA AAACCCGCAG 

4251 TGCGGAATGG GGCGTAAACG CCGAAATCAA AGGTTTCACG CTGTCCCTCC 

4301 ACGCTGCCGC CGCCAAAGGC CCGCAACTGG AAGCGCAACA CAGCGCGGGC 

4351 ATCAAATTAG GCTACCGCTG GTAA 



This corresponds to the amino acid sequence <SEQ ID 1047; ORF orf1-1>: 

orf1-1.pep 

1 MKTTDKRTTE THRKAPKTGR IRFSPAYLAI CLSFGILPQA WAGHTYFGIN 

51 YQYYRDFAEN KGKFAVGAKD IEVYNKKGEL VGKSMTKAPM IDFSVVSRNG 

101 VAALVGDQYI VSVAHNGGYN NVDFGAEGRN PDQHRFTYKI VKRNNYKAGT 

151 KGHPYGGDYH MPRLHKFVTD AEPVEMTSYM DGRKYIDQNN YPDRVRIGAG 

201 RQYWRSDEDE PNNRESSYHI ASAYSWLVGG NTFAQNGSGG GTVNLGSEKI 

251 KHSPYGFLPT GGSFGDSGSP MFIYDAQKQK WLINGVLQTG NPYIGKSNGF 

301 QLVRKDWFYD EIFAGDTHSV FYEPRQNGKY SFNDDNNGTG KINAKHEHNS 

351 LPNRLKTRTV QLFNVSLSET AREPVYHAAG GVNSYRPRLN NGENISFIDE 

401 GKGELILTSN INQGAGGLYF QGDFTVSPEN NETWQGAGVH ISEDSTVTWK 

451 VNGVANDRLS KIGKGTLHVQ AKGENQGSIS VGDGTVILDQ QADDKGKKQA 

501 FSEIGLVSGR GTVQLNADNQ FNPDKLYFGF RGGRLDLNGH SLSFHRIQNT 

551 DEGAMIVNHN QDKESTVTIT GNKDIATTGN NNSLDSKKEI AYNGWFGEKD 

601 TTKTNGRLNL VYQPAAEDRT LLLSGGTNLN GNITQTNGKL FFSGRPTPHA 

651 YNHLNDHWSQ KEGIPRGEIV WDNDWINRTF KAENFQIKGG QAVVSRNVAK 

701 VKGDWHLSNH AQAVFGVAPH QSHTICTRSD WTGLTNCVEK TITDDKVIAS 

751 LTKTDISGNV DLADHAHLNL TGLATLNGNL SANGDTRYTV SHNATQNGNL 

801 SLVGNAQATF NQATLNGNTS ASGNASFNLS DHAVQNGSLT LSGNAKANVS 

851 HSALNGNVSL ADKAVFHFES SRFTGQISGG KDTALHLKDS EWTLPSGTEL 

901 GNLNLDNATI TLNSAYRHDA AGAQTGSATD APRRRSRRSR RSLLSVTPPT 

951 SVESRFNTLT VNGKLNGQGT FRFMSELFGY RSDKLKLAES SEGTYTLAVN 

1001 NTGNEPASLE QLTVVEGKDN KPLSENLNFT LQNEHVDAGA WRYQLIRKDG 

1051 EFRLHNPVKE QELSDKLGKA EAKKQAEKDN AQSLDALIAA GRDAVEKTES 

1101 VAEPARQAGG ENVGIMQAEE EKKRVQADKD TALAKQREAE TRPATTAFPR 

1151 ARRARRDLPQ LQPQPQPQPQ RDLISRYANS GLSEFSATLN SVFAVQDELD 

1201 RVFAEDRRNA VWTSGIRDTK HYRSQDFRAY RQQTDLRQIG MQKNLGSGRV 

1251 GILFSHNRTE NTFDDGIGNS ARLAHGAVFG QYGIDRFYIG ISAGAGFSSG 

1301 SLSDGIGGKI RRRVLHYGIQ ARYRAGFGGF GIEPHIGATR YFVQKADYRY 

1351 ENVNIATPGL AFNRYRAGIK ADYSFKPAQH ISITPYLSLS YTDAASGKVR 

1401 TRVNTAVLAQ DFGKTRSAEW GVNAEIKGFT LSLHAAAAKG PQLEAQHSAG 

1451 IKLGYRW* 
The following partial DNA sequence was identified in N. meningitidis <SEQ ID 1048>: 

orf46-2.seq 

1 TTGGGCATTT CCCGCAAAAT ATCCCTTATT CTGTCCATAC TGGCAGTGTG 

51 CCTGCCGATG CATGCACACG CCTCAGATTT GGCAAACGAT TCTTTTATCC 

101 GGCAGGTTCT CGACCGTCAG CATTTCGAAC CCGACGGGAA ATACCACCTA 

151 TTCGGCAGCA GGGGGGAACT TGCCGAGCGC AGCGGCCATA TCGGATTGGG 

201 AAAAATACAA AGCCATCAGT TGGGCAACCT GATGATTCAA CAGGCGGCCA 

251 TTAAAGGAAA TATCGGCTAC ATTGTCCGCT TTTCCGATCA CGGGCACGAA 

301 GTCCATTCCC CCTTCGACAA CCATGCCTCA CATTCCGATT CTGATGAAGC 

351 CGGTAGTCCC GTTGACGGAT TTAGCCTTTA CCGCATCCAT TGGGACGGAT 

401 ACGAACACCA TCCCGCCGAC GGCTATGACG GGCCACAGGG CGGCGGCTAT 

451 CCCGCTCCCA AAGGCGCGAG GGATATATAC AGCTACGACA TAAAAGGCGT 

501 TGCCCAAAAT ATCCGCCTCA ACCTGACCGA CAACCGCAGC ACCGGACAAC 

551 GGCTTGCCGA CCGTTTCCAC AATGCCGGTA GTATGCTGAC GCAAGGAGTA 

601 GGCGACGGAT TCAAACGCGC CACCCGATAC AGCCCCGAGC TGGACAGATC 

651 GGGCAATGCC GCCGAAGCCT TCAACGGCAC TGCAGATATC GTTAAAAACA 

701 TCATCGGCGC GGCAGGAGAA ATTGTCGGCG CAGGCGATGC CGTGCAGGGC 

751 ATAAGCGAAG GCTCAAACAT TGCTGTCATG CACGGCTTGG GTCTGCTTTC 

801 CACCGAAAAC AAGATGGCGC GCATCAACGA TTTGGCAGAT ATGGCGCAAC 

851 TCAAAGACTA TGCCGCAGCA GCCATCCGCG ATTGGGCAGT CCAAAACCCC 

901 AATGCCGCAC AAGGCATAGA AGCCGTCAGC AATATCTTTA TGGCAGCCAT 

951 CCCCATCAAA GGGATTGGAG CTGTTCGGGG AAAATACGGC TTGGGCGGCA 

1001 TCACGGCACA TCCTATCAAG CGGTCGCAGA TGGGCGCGAT CGCATTGCCG 

1051 AAAGGGAAAT CCGCCGTCAG CGACAATTTT GCCGATGCGG CATACGCCAA 

1101 ATACCCGTCC CCTTACCATT CCCGAAATAT CCGTTCAAAC TTGGAGCAGC 

1151 GTTACGGCAA AGAAAACATC ACCTCCTCAA CCGTGCCGCC GTCAAACGGC 

1201 AAAAATGTCA AACTGGCAGA CCAACGCCAC CCGAAGACAG GCGTACCGTT 

1251 TGACGGTAAA GGGTTTCCGA ATTTTGAGAA GCACGTGAAA TATGATACGA 

1301 AGCTCGATAT TCAAGAATTA TCGGGGGGCG GTATACCTAA GGCTAAGCCT 

1351 GTGTTTGATG CGAAACCGAG ATGGGAGGTT GATAGGAAGC TTAATAAATT 

1401 GACAACTCGT GAGCAGGTGG AGAAAAATGT TCAGGAAATA AGGAACGGTA 

1451 ATATAAACAG TAACTTTAGC CAACATGCTC AACTAGAGAG GGAAATTAAT 

1501 AAACTAAAAT CTGCCGATGA AATTAATTTT GCAGATGGAA TGGGAAAATT 

1551 TACCGATAGC ATGAATGACA AGGCTTTTAG TAGGCTTGTG AAATCAGTTA 

1601 AAGAGAATGG CTTCACAAAT CCAGTTGTGG AGTACGTTGA AATAAATGGA 

1651 AAAGCATATA TCGTAAGAGG AAATAATRGG GTTTTTGCTG CAGAATACCT 

1701 TGGCAGGATA CATGAATTAA AATTTAAAAA AGTTGACTTT CCTGTTCCTA 

1751 ATACTAGTTG GAAAAATCCT ACTGATGTCT TGAATGAATC AGGTAATGTT 

1801 AAGAGACCTC GTTATAGGAG TAAATAA 



This corresponds to the amino acid sequence <SEQ ID 1049; ORF orf46-2>: 

orf46-2.pep 

1 LGISRKISLI LSILAVCLPM HAHASDLAND SFIRQVLDRQ HFEPDGKYHL 

51 FGSRGELAER SGHIGLGKIQ SHQLGNLMIQ QAAIKGNIGY IVRFSDHGHE 

101 VHSPFDNHAS HSDSDEAGSP VDGFSLYRIH WDGYEHHPAD GYDGPQGGGY 

151 PAPKGARDIY SYDIKGVAQN IRLNLTDNRS TGQRLADRFH NAGSMLTQGV 

201 GDGFKRATRY SPELDRSGNA AEAFNGTADI VKNIIGAAGE IVGAGDAVQG 

251 ISEGSNIAVM HGLGLLSTEN KMARINDLAD MAQLKDYAAA AIRDWAVQNP 

301 NAAQGIEAVS NIFMAAIPIK GIGAVRGKYG LGGITAHPIK RSQMGAIALP 

351 KGKSAVSDNF ADAAYAKYPS PYHSRNIRSN LEQRYGKENI TSSTVPPSNG 

401 KNVKLADQRH PKTGVPFDGK GFPNFEKHVK YDTKLDIQEL SGGGIPKAKP 

451 VFDAKPRWEV DRKLNKLTTR EQVEKNVQEI RNGNINSNFS QHAQLEREIN 

501 KLKSADEINF ADGMGKFTDS MNDKAFSRLV KSVKENGFTN PVVEYVEING 

551 KAYIVRGNNR VFAAEYLGRI HELKFKKVDF PVPNTSWKNP TDVLNESGNV 

601 KRPRYRSK* 
Using the above-described procedures, the following oligonucleotide primers were 
employed in the polymerase chain reaction (PCR) assay in order to clone the ORFs as 
indicated: 
Oligonucleotides used for PCR 

Table 1 

ORF   Primer   Sequence                                             Restriction sites 

279   Forward  CGCGGATCCCATATG-TTGCCTGCAATCACGATT <SEQ ID 1050>     BamHI-NdeI 
      Reverse  CCCGCTCGAG-TTTAGAAGCGGGCGGCAA <SEQ ID 1051>          XhoI 

519   Forward  CGCGGATCCCATATG-TTCAAATCCTTTGTCGTCA <SEQ ID 1052>    BamHI-NdeI 
      Reverse  CCCGCTCGAG-TTTGGCGGTTTTGCTGG <SEQ ID 1053>           XhoI 

576   Forward  CGCGGATCCCATATG-GCCGCCCCCGCATCT <SEQ ID 1054>        BamHI-NdeI 
      Reverse  CCCGCTCGAG-ATTTACTTTTTTGATGTGGAC <SEQ ID 1055>       XhoI 

919   Forward  CGCGGATCCCATATG-TGCGAAAGGAAGAGCATC <SEQ ID 1056>     BamHI-NdeI 
      Reverse  CGCGCTCGAG-CGGGCGGTATTCGGG <SEQ ID 1057>             XhoI 

121   Forward  CGCGGATCCCATATG-GAAACACAGCTTTACAT <SEQ ID 1058>      BamHI-NdeI 
      Reverse  CCCGCTCGAG-ATAATAATATCCCGCGCCC <SEQ ID 1059>         XhoI 

128   Forward  CGCGGATCCCATATG-AGTGAGAAGGCAGT <SEQ ID 1060>         BamHI-NdeI 
      Reverse  CCCGCTCGAG-GACCGCGTTGTCGAAA <SEQ ID 1061>            XhoI 

206   Forward  CGCGGATCCCATATG-AAACAGGGCCAACGGA <SEQ ID 1062>       BamHI-NdeI 
      Reverse  CCCGCTCGAG-TTCTGTAAAAAAAGTATGTGC <SEQ ID 1063>       XhoI 

287   Forward  CCGGAATTCTAGCTAGC-CTTTCAGCGTGGGGG <SEQ ID 1064>      EcoRI-NheI 
      Reverse  CCCGCTCGAG-ATGCTGCTGTTTTTTGGC <SEQ ID 1065>          XhoI 

406   Forward  CGCGGATCCCATATG-TGCGGGACACTGACAG <SEQ ID 1066>       BamHI-NdeI 
      Reverse  CCCGCTCGAG-AGGTTGTCCTTGTCTATG <SEQ ID 1067>          XhoI 
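
Each primer in Table 1 is written as a 5' tail carrying the restriction sites, a hyphen, and a gene-specific part. As a hedged illustration, the sketch below checks which recognition sequences occur in the tail of a primer. The recognition sequences are the well-known ones for these enzymes; the helper name `sites_in_tail` and the dictionary layout are hypothetical and not taken from the source.

```python
# Recognition sequences (5'->3') for the enzymes named in Table 1.
RECOGNITION_SITES = {
    "BamHI": "GGATCC",
    "NdeI": "CATATG",
    "XhoI": "CTCGAG",
    "EcoRI": "GAATTC",
    "NheI": "GCTAGC",
}


def sites_in_tail(primer: str):
    """Split 'TAIL-GENEPART' at the hyphen and list the sites found in the tail."""
    tail, _, gene_part = primer.partition("-")
    found = [name for name, site in RECOGNITION_SITES.items() if site in tail]
    return found, gene_part


# ORF 279 primers as listed in Table 1, plus the ORF 287 forward primer.
orf279_forward = "CGCGGATCCCATATG-TTGCCTGCAATCACGATT"
orf279_reverse = "CCCGCTCGAG-TTTAGAAGCGGGCGGCAA"
orf287_forward = "CCGGAATTCTAGCTAGC-CTTTCAGCGTGGGGG"

for label, primer in [
    ("279 forward", orf279_forward),
    ("279 reverse", orf279_reverse),
    ("287 forward", orf287_forward),
]:
    sites, gene_part = sites_in_tail(primer)
    print(label, sites, gene_part)
```

A check of this kind flags primers whose tail does not contain the sites listed in the last column, which is useful when the table has been transcribed by hand or by OCR.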






EXAMPLE 2 






Expression of ORF 919 





The primers described in Table 1 for ORF 919 were used to locate and clone ORF 919. 
The predicted gene 919 was cloned in pET vector and expressed in E. coli. The product of 
protein expression and purification was analyzed by SDS-PAGE. Panel A shows the 
analysis of 919-His fusion protein purification. Mice were immunized with the purified 
919-His and sera were used for Western blot (panel B), FACS analysis (panel C), 
bactericidal assay (panel D), and ELISA assay (panel E). Symbols: M1, molecular weight 
marker; PP, purified protein; TP, N. meningitidis total protein extract; OMV, N. 
meningitidis outer membrane vesicle preparation. Arrows indicate the position of the main 
recombinant protein product (A) and the N. meningitidis immunoreactive band (B). These 
experiments confirm that 919 is a surface-exposed protein and that it is a useful 
immunogen. The hydrophilicity plots, antigenic index, and amphipathic regions of ORF 919 
are provided in Figure 10. The AMPHI program is used to predict putative T-cell epitopes 
(Gao et al. 1989, J. Immunol 143:3007; Roberts et al. 1996, AIDS Res Human Retroviruses 
12:593; Quakyi et al. 1992, Scand J Immunol Suppl 11:9). The nucleic acid sequence of 
ORF 919 and the amino acid sequence encoded thereby are provided in Example 1. 
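
The text does not say which scale or window was used for the hydrophilicity plots. Purely as an illustration, the sketch below computes a sliding-window hydrophilicity profile assuming the Hopp-Woods scale and a window of six residues; both choices, and the function name `hydrophilicity_profile`, are assumptions. The antigenic index and AMPHI predictions are not reproduced here.

```python
# Hopp-Woods hydrophilicity values (assumed scale; the source names none).
HOPP_WOODS = {
    "R": 3.0, "D": 3.0, "E": 3.0, "K": 3.0, "S": 0.3, "N": 0.2, "Q": 0.2,
    "G": 0.0, "P": 0.0, "T": -0.4, "H": -0.5, "A": -0.5, "C": -1.0,
    "M": -1.3, "V": -1.5, "I": -1.8, "L": -1.8, "Y": -2.3, "F": -2.5,
    "W": -3.4,
}


def hydrophilicity_profile(protein: str, window: int = 6):
    """Mean hydrophilicity of each window of `window` consecutive residues."""
    scores = [HOPP_WOODS[residue] for residue in protein.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


# First 20 residues of m726.pep from the sequence listing, used as sample input.
sample = "MTIYFKNGFYDDTLGGIPEG"
profile = hydrophilicity_profile(sample)
for position, value in enumerate(profile, start=1):
    print(position, round(value, 2))
```

Peaks in such a profile mark stretches enriched in charged residues, which is the kind of feature a hydrophilicity plot is meant to highlight.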


EXAMPLE 3 
Expression of ORF 279 
The primers described in Table 1 for ORF 279 were used to locate and clone ORF 279. 
The predicted gene 279 was cloned in pGex vector and expressed in E. coli. The product of 
protein expression and purification was analyzed by SDS-PAGE. Panel A shows the 
analysis of 279-GST purification. Mice were immunized with the purified 279-GST and sera 
were used for Western blot analysis (panel B), FACS analysis (panel C), bactericidal assay 
(panel D), and ELISA assay (panel E). Symbols: M1, molecular weight marker; TP, N. 
meningitidis total protein extract; OMV, N. meningitidis outer membrane vesicle 
preparation. Arrows indicate the position of the main recombinant protein product (A) and 
the N. meningitidis immunoreactive band (B). These experiments confirm that 279 is a 
surface-exposed protein and that it is a useful immunogen. The hydrophilicity plots, 
antigenic index, and amphipathic regions of ORF 279 are provided in Figure 11. The AMPHI 
program is used to predict putative T-cell epitopes (Gao et al. 1989, J. Immunol 143:3007; 
Roberts et al. 1996, AIDS Res Human Retroviruses 12:593; Quakyi et al. 1992, Scand J 
Immunol Suppl 11:9). The nucleic acid sequence of ORF 279 and the amino acid sequence 
encoded thereby are provided in Example 1. 



EXAMPLE 4 

Expression of ORF 576 
The primers described in Table 1 for ORF 576 were used to locate and clone ORF 576. 
The predicted gene 576 was cloned in pGex vector and expressed in E. coli. The product of 
protein purification was analyzed by SDS-PAGE. Panel A shows the analysis of 576-GST 
fusion protein purification. Mice were immunized with the purified 576-GST and sera 
were used for Western blot (panel B), FACS analysis (panel C), bactericidal assay (panel D), 
and ELISA assay (panel E). Symbols: M1, molecular weight marker; TP, N. meningitidis 
total protein extract; OMV, N. meningitidis outer membrane vesicle preparation. Arrows 
indicate the position of the main recombinant protein product (A) and the N. meningitidis 
immunoreactive band (B). These experiments confirm that ORF 576 is a surface-exposed 
protein and that it is a useful immunogen. The hydrophilicity plots, antigenic index, and 
amphipathic regions of ORF 576 are provided in Figure 12. The AMPHI program is used to 
predict putative T-cell epitopes (Gao et al. 1989, J. Immunol 143:3007; Roberts et al. 1996, 
AIDS Res Human Retroviruses 12:593; Quakyi et al. 1992, Scand J Immunol Suppl 11:9). 
The nucleic acid sequence of ORF 576 and the amino acid sequence encoded thereby are 
provided in Example 1. 



EXAMPLE 5 
Expression of ORF 519 

The primers described in Table 1 for ORF 519 were used to locate and clone ORF 519. 
The predicted gene 519 was cloned in pET vector and expressed in E. coli. The product of 
protein purification was analyzed by SDS-PAGE. Panel A shows the analysis of 519-His 
fusion protein purification. Mice were immunized with the purified 519-His and sera 
were used for Western blot (panel B), FACS analysis (panel C), bactericidal assay (panel D), 
and ELISA assay (panel E). Symbols: M1, molecular weight marker; TP, N. meningitidis 
total protein extract; OMV, N. meningitidis outer membrane vesicle preparation. Arrows 
indicate the position of the main recombinant protein product (A) and the N. meningitidis 
immunoreactive band (B). These experiments confirm that 519 is a surface-exposed protein 
and that it is a useful immunogen. The hydrophilicity plots, antigenic index, and amphipathic 
regions of ORF 519 are provided in Figure 13. The AMPHI program is used to predict 
putative T-cell epitopes (Gao et al. 1989, J. Immunol 143:3007; Roberts et al. 1996, AIDS Res 
Human Retroviruses 12:593; Quakyi et al. 1992, Scand J Immunol Suppl 11:9). The nucleic 
acid sequence of ORF 519 and the amino acid sequence encoded thereby are provided in 
Example 1. 



EXAMPLE 6 

Expression of ORF 121 
The primers described in Table 1 for ORF 121 were used to locate and clone ORF 121. 
The predicted gene 121 was cloned in pET vector and expressed in E. coli. The product of 
protein purification was analyzed by SDS-PAGE. Panel A shows the analysis of 121-His 
fusion protein purification. Mice were immunized with the purified 121-His and sera 
were used for Western blot analysis (panel B), FACS analysis (panel C), bactericidal assay 
(panel D), and ELISA assay (panel E). Results show that 121 is a surface-exposed protein. 
Symbols: M1, molecular weight marker; TP, N. meningitidis total protein extract; OMV, N. 
meningitidis outer membrane vesicle preparation. Arrows indicate the position of the main 
recombinant protein product (A) and the N. meningitidis immunoreactive band (B). These 
experiments confirm that 121 is a surface-exposed protein and that it is a useful immunogen. 
The hydrophilicity plots, antigenic index, and amphipathic regions of ORF 121 are provided in 
Figure 14. The AMPHI program is used to predict putative T-cell epitopes (Gao et al. 1989, J. 
Immunol 143:3007; Roberts et al. 1996, AIDS Res Human Retroviruses 12:593; Quakyi et al. 
1992, Scand J Immunol Suppl 11:9). The nucleic acid sequence of ORF 121 and the amino 
acid sequence encoded thereby are provided in Example 1. 

EXAMPLE 7 
Expression of ORF 128 
The primers described in Table 1 for ORF 128 were used to locate and clone ORF 128. 
The predicted gene 128 was cloned in pET vector and expressed in E. coli. The product of 
protein purification was analyzed by SDS-PAGE. Panel A shows the analysis of 128-His 
purification. Mice were immunized with the purified 128-His and sera were used for 
Western blot analysis (panel B), FACS analysis (panel C), bactericidal assay (panel D), and 
ELISA assay (panel E). Results show that 128 is a surface-exposed protein. Symbols: M1, 
molecular weight marker; TP, N. meningitidis total protein extract; OMV, N. meningitidis 
outer membrane vesicle preparation. Arrows indicate the position of the main recombinant 
protein product (A) and the N. meningitidis immunoreactive band (B). These experiments 
confirm that 128 is a surface-exposed protein and that it is a useful immunogen. The 
hydrophilicity plots, antigenic index, and amphipathic regions of ORF 128 are provided in 
Figure 15. The AMPHI program is used to predict putative T-cell epitopes (Gao et al. 1989, J. 
Immunol 143:3007; Roberts et al. 1996, AIDS Res Human Retroviruses 12:593; Quakyi et al. 
1992, Scand J Immunol Suppl 11:9). The nucleic acid sequence of ORF 128 and the amino 
acid sequence encoded thereby are provided in Example 1. 

EXAMPLE 8 

Expression of ORF 206 
The primers described in Table 1 for ORF 206 were used to locate and clone ORF 206. 
The predicted gene 206 was cloned in pET vector and expressed in E. coli. The product of 
protein purification was analyzed by SDS-PAGE. Panel A shows the analysis of 206-His 
purification. Mice were immunized with the purified 206-His and sera were used for 
Western blot analysis (panel B). It is worth noting that the immunoreactive band in protein 
extracts from meningococcus is 38 kDa instead of 17 kDa (panel A). To gain information on 
the nature of this antibody staining we expressed ORF 206 in E. coli without the His-tag and 
including the predicted leader peptide. Western blot analysis on total protein extracts from 
E. coli expressing this native form of the 206 protein showed a reactive band at a position of 
38 kDa, as observed in meningococcus. We conclude that the 38 kDa band in panel B is 
specific and that anti-206 antibodies likely recognize a multimeric protein complex. Panel C 
shows the FACS analysis, panel D the bactericidal assay, and panel E the ELISA 
assay. Results show that 206 is a surface-exposed protein. Symbols: M1, molecular weight 
marker; TP, N. meningitidis total protein extract; OMV, N. meningitidis outer membrane 
vesicle preparation. Arrows indicate the position of the main recombinant protein product (A) 
and the N. meningitidis immunoreactive band (B). These experiments confirm that 206 is a 
surface-exposed protein and that it is a useful immunogen. The hydrophilicity plots, 
antigenic index, and amphipathic regions of ORF 206 are provided in Figure 16. The AMPHI 
program is used to predict putative T-cell epitopes (Gao et al. 1989, J. Immunol 143:3007; 
Roberts et al. 1996, AIDS Res Human Retroviruses 12:593; Quakyi et al. 1992, Scand J 
Immunol Suppl 11:9). The nucleic acid sequence of ORF 206 and the amino acid sequence 
encoded thereby are provided in Example 1. 

EXAMPLE 9 

Expression of ORF 287 

The primers described in Table 1 for ORF 287 were used to locate and clone ORF 287. 
The predicted gene 287 was cloned in pGex vector and expressed in E. coli. The product of 
protein purification was analyzed by SDS-PAGE. Panel A shows the analysis of 287-GST 
fusion protein purification. Mice were immunized with the purified 287-GST and sera 
were used for FACS analysis (panel B), bactericidal assay (panel C), and ELISA assay (panel 
D). Results show that 287 is a surface-exposed protein. Symbols: M1, molecular weight 
marker. Arrow indicates the position of the main recombinant protein product (A). These 
experiments confirm that 287 is a surface-exposed protein and that it is a useful immunogen. 
The hydrophilicity plots, antigenic index, and amphipathic regions of ORF 287 are provided in 
Figure 17. The AMPHI program is used to predict putative T-cell epitopes (Gao et al. 1989, J. 
Immunol 143:3007; Roberts et al. 1996, AIDS Res Human Retroviruses 12:593; Quakyi et al. 
1992, Scand J Immunol Suppl 11:9). The nucleic acid sequence of ORF 287 and the amino 
acid sequence encoded thereby are provided in Example 1. 

EXAMPLE 10 
Expression of ORF 406 

The primers described in Table 1 for ORF 406 were used to locate and clone ORF 406. 
The predicted gene 406 was cloned in pET vector and expressed in E. coli. The product of 
protein purification was analyzed by SDS-PAGE. Panel A shows the analysis of 406-His 
fusion protein purification. Mice were immunized with the purified 406-His and sera 
were used for Western blot analysis (panel B), FACS analysis (panel C), bactericidal assay 
(panel D), and ELISA assay (panel E). Results show that 406 is a surface-exposed protein. 
Symbols: M1, molecular weight marker; TP, N. meningitidis total protein extract; OMV, N. 
meningitidis outer membrane vesicle preparation. Arrows indicate the position of the main 
recombinant protein product (A) and the N. meningitidis immunoreactive band (B). These 
experiments confirm that 406 is a surface-exposed protein and that it is a useful immunogen. 
The hydrophilicity plots, antigenic index, and amphipathic regions of ORF 406 are provided in 
Figure 18. The AMPHI program is used to predict putative T-cell epitopes (Gao et al. 1989, J. 
Immunol 143:3007; Roberts et al. 1996, AIDS Res Human Retroviruses 12:593; Quakyi et al. 
1992, Scand J Immunol Suppl 11:9). The nucleic acid sequence of ORF 406 and the amino 
acid sequence encoded thereby are provided in Example 1. 

The foregoing examples are intended to illustrate but not to limit the invention. 
Claims 

1. A method for identifying an amino acid sequence, comprising the step of 
searching for putative open reading frames or protein-coding sequences within one or more 
of N. meningitidis nucleotide sequences SEQ ID NOs 1-961 and 1068, or even-numbered 
SEQ ID NOs from SEQ ID 962 to SEQ ID 1044. 

2. A method according to claim 1, comprising the steps of searching a 
N. meningitidis nucleotide sequence for an initiation codon and searching the upstream 
sequence for an in-frame termination codon. 

3. A method for producing a protein, comprising the step of expressing a protein 
comprising an amino acid sequence identified according to any one of claims 1-2. 

4. A method for identifying a protein in N. meningitidis, comprising the steps of 
producing a protein according to claim 3, producing an antibody which binds to the protein, 
and determining whether the antibody recognises a protein produced by N. meningitidis. 

5. Nucleic acid comprising an open reading frame or protein-coding sequence 
identified by a method according to any one of claims 1-2. 

6. A protein obtained by the method of claim 3. 

7. Nucleic acid comprising one or more of the N. meningitidis nucleotide 
sequences SEQ ID NOs 1-961 and 1068, or even-numbered SEQ ID NOs from SEQ ID NO 
962 to SEQ ID NO 1044. 

8. Nucleic acid comprising a nucleotide sequence having greater than 50% 
sequence identity to a nucleotide sequence disclosed in the sequence listing SEQ ID NOs 1- 
961 and 1068, or even-numbered SEQ ID NOs from SEQ ID 962 to SEQ ID 1044. 

9. Nucleic acid comprising a fragment of a nucleotide sequence disclosed in the 
sequence listing SEQ ID NOs 1-961 and 1068, or even-numbered SEQ ID NOs from SEQ ID 
962 to SEQ ID 1044. 

10. Nucleic acid according to claim 9, wherein the fragment is unique to the 
genome of N. meningitidis. 

11. Nucleic acid complementary to the nucleic acid of any one of claims 7-10. 

12. A protein comprising an amino acid sequence encoded within one or more of 
the N. meningitidis nucleotide sequences SEQ ID NOs 1-961 and 1068, or even-numbered 
SEQ ID NOs from SEQ ID 962 to SEQ ID 1044. 

13. A protein comprising an amino acid sequence having greater than 50% 
sequence identity to an amino acid sequence encoded within one or more of the 
N. meningitidis nucleotide sequences SEQ ID NOs 1-961 and 1068, or even-numbered SEQ 
ID NOs from SEQ ID 962 to SEQ ID 1044. 

14. A protein comprising a fragment of an amino acid sequence selected from the 
group consisting of one or more odd-numbered SEQ ID NOs 963-1037, amino acid 
sequences having greater than 50% identity with one or more odd-numbered SEQ ID NOs 
963-1045, amino acid sequences encoded within one or more of the N. meningitidis 
nucleotide sequences SEQ ID NOs 1-961 and 1068, and amino acid sequences encoded by 
one or more even-numbered SEQ ID NOs from SEQ ID 962 to SEQ ID 1044. 

15. Nucleic acid encoding a protein according to any one of claims 6-8. 

16. A computer, a computer memory, a computer storage medium or a computer 
database containing the nucleotide sequence of a nucleic acid according to any one of claims 
7-11. 



WO 00/22430 



PCT/US99/23573 




17. A computer, a computer memory, a computer storage medium or a computer 
database containing one or more of the N. meningitidis nucleotide sequences SEQ ID NOs 1- 
961. 

18. A polyclonal or monoclonal antibody which binds to a protein according to 
any one of claims 12-14 or 6. 

19. A nucleic acid probe comprising nucleic acid according to any one of claims 
5, 7-10, or 15. 

20. An amplification primer comprising nucleic acid according to any one of 
claims 5, 7-10, or 15. 

21. A composition comprising (a) nucleic acid according to any one of claims 5, 
7-10, or 15; (b) protein according to any one of claims 12-14; and/or (c) an antibody 
according to claim 18. 

22. The use of a composition according to claim 21 as a medicament or as a 
diagnostic reagent. 

23. The use of a composition according to claim 21 in the manufacture of (a) a 
medicament for treating or preventing infection due to Neisserial bacteria and/or (b) a 
diagnostic reagent for detecting the presence of Neisserial bacteria or of antibodies raised 
against Neisserial bacteria. 

24. A method of treating a patient, comprising administering to the patient a 
therapeutically effective amount of a composition according to claim 21. 
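
Claims 1 and 2 describe searching a nucleotide sequence for putative open reading frames by locating an initiation codon and an in-frame termination codon upstream of it. The following is a minimal sketch of such a scan on the forward strand only. It assumes ATG as the sole initiation codon (the listing above also contains an ORF that begins with TTG), and the function names and the toy input sequence are hypothetical rather than taken from the source.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODONS = {"ATG"}  # assumption: other initiation codons are ignored here


def upstream_inframe_stop(seq: str, start: int):
    """Index of the nearest in-frame stop codon upstream of `start`, or None."""
    for i in range(start - 3, -1, -3):
        if seq[i:i + 3] in STOP_CODONS:
            return i
    return None


def downstream_inframe_stop(seq: str, start: int):
    """Index of the first in-frame stop codon downstream of `start`, or None."""
    for i in range(start + 3, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i
    return None


def putative_orfs(seq: str, min_codons: int = 1):
    """List (start, end, upstream_stop) for each start codon closed by a stop.

    `end` is the index just past the stop codon; `upstream_stop` is None when
    no in-frame stop precedes the start codon.
    """
    seq = seq.upper()
    orfs = []
    for start in range(len(seq) - 2):
        if seq[start:start + 3] not in START_CODONS:
            continue
        stop = downstream_inframe_stop(seq, start)
        if stop is not None and (stop - start) // 3 >= min_codons:
            orfs.append((start, stop + 3, upstream_inframe_stop(seq, start)))
    return orfs


# Toy sequence (hypothetical): stop codon, then ATG GGC, then a stop codon.
toy = "TAAATGGGCTGA"
print(putative_orfs(toy))
```

A fuller scan would also cover the reverse complement and alternative initiation codons; those are left out to keep the sketch short.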
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279 (10.5 kDa) 

PURIFICATION 
Ml 279 




279 (10.5 kDa) 
WESTERN BLOT 
TP OMV 



279 (10.5 kDa) 
FACS 




279 (10.5 kDa) 
BACTERICIDAL ASSAY 




(Fig.2^ 

279 (10.5 kDa) 
ELBA assay: positive 



time 

[Figure: protein 576 (27.8 kDa). Panels: purification; Western blot (TP, OMV); bactericidal assay; ELISA assay: positive.]

[Fig. 4c: protein 519 (33 kDa). Panels: purification; Western blot (TP, OMV).]

[Figure: protein 121 (40 kDa). Panels: Western blot (TP, OMV); FACS.]

[Figure: protein 128 (101 kDa). Panels: purification; Western blot (TP, OMV); FACS; bactericidal assay (preimmune, GST, 128); ELISA assay: positive.]

[Figure: protein 206 (17 kDa). Panels: purification; Western blot (TP, OMV); FACS; bactericidal assay; ELISA assay: positive.]

[Figure: protein 287 (78 kDa). Panels: purification; bactericidal assay (preimmune, GST); FACS.]

[Figure: protein 406 (33 kDa). Panels: purification; Western blot; FACS; bactericidal assay; ELISA assay: positive.]

[Figs. 10, 11, 13, 14, 15 and 17 (plots not reproduced): Hydrophilicity Plot, Antigenic Index and AMPHI Regions.]

APPENDIX A



Coordinates of Sequences Released in Contigs 


Contig No. 


Sequence Name 


Coordinate 






GNMAA01R 


9866 




^ 


GNMAA27F 


10765 





^ 


GNMAA27R 


11771 


"12130 





GNMBA57F 


5365 


-— 


^ 


GNMBA57R 


"1^^^ 


— — 


^ 


GNMCD17F 







^ 


GNMCD21F 


"14937 




— — = 


ij 


GNMCD21R 


16217 




^ 


GNMCD26F 


"li^iS 


— — 


1 


GNMCD26R 




— — — 





GNMCD28F 


27012 








GNMCD58F 


27525 





^ 


GNMCD58R 


26208 


26582


i 


GNMCF39F


25928 


^6411 





GNMCF39R 


24501 


' — ~ 





GNMCK12F 


18475 








GNMCK12R 


16734 


T7175 


^ 


GNMCL43F 


31264 


"31793 





GNMCL43R 


32603 


33038 





GNMCL77F 


7112 


"P^S 


^ 


GNMCL77R 


8587 




^ 


GNMCO24R


8321 


"8920 


^ 


GNMCP77F 


24906 





^ 


GNMCP77R


lUP 





^ 


GNMCQ74F 







^ 




GNMCQ74R 





— — 

^ 





GNMCS43F


"3607 ~~ 







GNMCS56F 





— — 




GNMCS57F 


7909




— ^1 


:j 


GNMCV14F


5771




^ 




GNMCV15R 


"7143 


— — 




GNMCV64F 


23017


— — — 


^ 


GNMCV64R


21277 


"22018 


^ 


GNMCV74F




"17305 


^ 


GNMCV74R 


18058 


18796 




GNMCV83F 


4008 


4503 




GNMCV83R 


2768 


3286 




GNMCY30F 


7157 


7897 




GNMCY30R 


8378 


8912 




GNMCZ78F 


14192 


14686 




GNMCZ78R 


15697 


16234 




GNMCZ93F 


31337 


31862 




GNMCZ93R 


30119 


30639 


2 


GNMAA02F 


27133 


27648 




GNMAA02R 


26120 


26546 




GNMAA38F


16163 


16379 


2 


GNMAA38R 


14815 


15335 


2 


GNMAA46F 


2337 


2704 


2 


GNMAA46R


3242 


3746 


^ 


GNMBA17F 


15637 


15798 


? 


GNMCD47F 


11113 


11453 


2 


GNMCD78F 


13704 


14196 


2 


GNMCD78R 


15013 


15380 


2 


GNMCK27F


4941 


5490 


2 


GNMCK27R 


3670 


4086 


2 


GNMCL17F


23033 


23527 


2 


GNMCL17R 


21424 


21995 


2 


GNMCL82F 


24805 


25200 


2 


GNMCL82R


26093 


26659 


2 


GNIVICNIQF 


5929 


6601 


2 


GNMCP32F 


18556 


19103 


2 


GNMCP32R


19956 


20403 


2 


GNMCQ84F


16351 


17040 


2 


GNMCQ92F 


3243 


3692 


2 


GNMCQ92R


2022 


2644 


2 


GNMCS51F


6645 


7300 


2 


GNMCV24F


28139 


28637 


2 


GNMCV25R 


26839 


27453 


2 


GNMCV77F


5149 


5575 


2 


GNMCV77R


6008 


6841 


2 


GNMCY52F 


21892 


22580 




GNMCY52R 


23157 


23662 




GNMCY74F 


21900 


22552 


2 


GNMCY74R 


23519 


24073 


2 


GNMCZ69F


1489 


1999 


2 


GNMCZ70F 


1489 


1985 


2 


GNMCZ70R


2707 


3232 


3 


GNMAA03F


16946 


17459 


3 


GNMAA03R


18236 


18447 


3 


GNMAA15F


3641 


4156 


3 


GNMAA15R


4704 


5176 


3 


GNMCA12F 


8812 


9427 




GNMCB27F 






3 


GNMCB27R 


21309 


21630 


3 


GNMCB59F 


22046 


22554 


3 


GNMCB59R


20650 


21230 


3 


GNMCD50F


8711 


9229 


3 


GNMCF53F 


15376 


15861 


3 


GNMCF53R


16619 


17312 


3 


GNMCF86F 


22322 


22760 


3 


GNMCL55F


12659 


13194 


3 


GNMCL55R 


13854 


14380 


3 


GNMCM46R 


11972 


12662 


3 


GNMCM63F 


7397 


8071 


3 


GNMCM63R 


8734 


9381 


3 


GNMCP05F 


2224 


2964 


3 


GNMCV27F 


10472 


10969 


3 


GNMCV28R 


11455 


12172 


4 


GNMAA04R 


21367 


21727 


4 


GNMAA66F 


9998 


10514 


4 


GNMAA66R


9150 


9669 


4 


GNMAA70F 


19444 


19961 


4 


GNMAA70R 


20446 


20841 


4 


GNMAB18F 


34311


34576 


4 


GNMAB18R 


32690 


33102 


4 


GNMBA24F 


21408 


21950 


4 


GNMCA71F 


35444 


36106 


4 


GNIVICASSF 


14906 


15535 


4 


GNMCB46F 


27141 


27652 


4 


GNMCB46R 


28558 


29138 


4 


GNMCD85F 


25929 


26447 


4 


GNMCF35F 


37587 


38065 


4 


GNMCF35R 


36661 


37327 


4 


GNMCK26F 


23722 


24268 


4 


GNMCK26R 


25176 


25751 


4 


GNMCK39F 


26270 


26836 


4 


GNMCK39R


27576 


27934 


4 


GNMCK64F


37686 


38053 


4 


GNMCK64R 


36356 


36915 


4 


GNMCL60F 


2659 


3206 


4 


GNMCL60R 


4028 


4520 


4 


GNMCM12F 


21992 


22465 


4 


GNMCM12R 


23335 


23919 


4 


GNMCM80F 


15507 


16171 


4 


GNMCM80R 


16264 


16990 


4 


GNMCN08R


33415


33739 


4 


GNMCO47F


23101 


23700 


4 


GNMCO47R


24872 


25344 




GNMCP24F 


34864 


35552 


4 


GNMCP24R 


33620 


34225 


4 


GNMCP44F 


24613 


24976 


4 


GNMCP44R 


25712 


26279 


4 


GNMCQ80F 


35274 


35964 


4 


GNMCQ80R


34053 


34632 


4 


GNMCS02F


37528 


38035 


4 


GNMCV40F 


33203 


33632 


4 


GNMCX19F 


37333 


38076 




GNMCX19R 


36229 


36871 


4 


GNMCX25F 


28667 


29362 




GNMCX25R 


27755 


28398 




GNMCX31F 


1336 


2085 




GNMCX31R


1 


640 


-.'^ . 


GNMCX38F 


15063 


15774 


4 


GNMCX38R 


14158 


14836 


4 


GNMCY53F 


8159 


8846 


. ..... 


GNMCY53R 


6905 


7405 


4 


GNMCZ25F 


42411 


42912 


4 


GNMCZ25R 


40673 


41229 


4 


GNMCZ27F


4786 


5245 


4 


GNMCZ27R 


3484 


4030 


5 


GNMAA05F 


5819 


6334 


5 


GNMAA05R 


6898 


7190 


5 


GNMAA09F 


15867 


16369 


5 


GNMAA09R 


15935 


16368 


5 


GNMAA50R 


17996 


18383 


5 


GNMAA51F 


44043 


44409 


5 


GNMAA51R 


43157 


43679 


5 


GNMCA06F 


43254 


43764 


5 


GNMCA72F 


7437 


8102 


5 


GNMCA87F 


36458 


36899 


5 


GNMCB41F 


44654 


45224 


5 


GNMCB41R 


45601 


46039 


5 


GNMCD77F 


46927 


47437 


5 


GNMCD77R 


48378 


48761 


5 


GNMCF13F 


18408 


18911 


5 


GNMCF13R 


16858 


17553 


5 


GNMCF26F 


44946 


45450 


5 


GNMCF26R 


46355 


47018 


5 


GNMCF51F 


31870 


32355 


5 


GNMCK15F 


34028 


34591 


5 


GNMCK15R 


33072 


33560 


5 


GNMCK52F 


13042 


13587 


5 


GNMCK52R 


11706 


12267 


5 


GNMCK67F 


16111 


16399 








14459 


5 


GNMCL36F 


26130 


26644 


5 


GNMCL36R 


24478 


25038 


5 


GNMCL57F 


46883 


47459 


5 


GNMCL57R 


48232 


48759 


5 


GNMCL93F 


6901 


7404 


5 


GNMCL93R 


5298 


5897 


5 


GNMCN22F 


4118 


4792 


5 


GNMCN22R


"~T?Iti 


--^^ 


5 


GNMCN58F 




17798 


5 


GNIVICNSSR 


"15825 


--1^1?? 


5 


GNMCN85F 


38026 





5 


GNMCN85R 


"iT^T 


39669 


5 


GNMCP14F 




-11??? 




GNMCP14R





48597 


5 


GNMCP42F





^?Z2J 


5 


GNMCP42R 





24875 


5 


^*^l>llVl^rDU^ 


— — 


31537 


5 


GNMCP60R 


— — — 

"kT^ 


-??11? 




GNMCQ39R 




1003 


5 


GNMCS18F 


39300 


39713 


1 


GNMCS74F


-il^?? 


41970 


_ 




47085


47801 







48062


48687 




oINML' vol r 


-i??5Z 


33720 


5 


GNMCV53F 





36106 


5 




"l^i^^ 


37232 


5 


GNMCV80F 


3433




5 


oiNiviovourv 


"T^Z§i 


2949 


5 


GNMCX14F 


"t^rl 


-1^2?? 


5 


GNMCX14R 







5 


GNMCY05F




26786


5 


GNMCY05R 





25665


i 


GNMCY24F 







46684







J 


47748 





GNMCY75F 


— 


9618 





GNMCY75R 








GNMCZ74F 


— 




33186 


5 


GNMCZ74R 


— — ^ 


-^?1Z? 


g 


GNMAA06F 




-i?^?2 


5 


OI>IIVlMMOor 


— 


22061 


^ 


OIN IVJMMOO r\ 




23120 




oi>J (viMMoyr 







11390 


6 







12870 


B 


oINMAD'+or 


""^^^^ 


14098 


6 


GNMAB56F 




21079 


6 


GNMCA67F 


37544 


38219 


6 


GNMCB01F 


34331 


34902 


6 


GNMCB01R 


35502 


36050 


6 


GNMCD62F 


6122 


6648 


6 


GNMCD62R 


4831 


5183 


6 


GNMCD93F 


1679 


2157 


6 


GNMCD93R 


3169 


3495 


6 


GNMCK06F 


20928 


21478 










.J^??Z 


20289 





olMIVIOLoyr 


--B£Z2^ 


25251 





O IM m LO «7 K 





23548 










33056 





V3lMIVIWIVI<ilK 





34334 





OfMIVIOINfUK 




14926 





uiMiviouo^r 


-H!®Z 


13922 





olMlVlt/Uoor 


26216 


26827 







25022 


25686 







16689


17300 





oi^Jiviooo 1 r 


^52? 


4184 





orMMOb/ ir 





41276 


1 


GNMCS83F 


32447 


33093 






30598 


31235 


fi 




oINMOVUor 


42819 


43260 







44363 


44932 





oNMCVZor 


14981 


15479 


1 


GNMCX36F 


38996 


39738 


1 




39855 


40528 


1 


oINIvlUAoyr 


39178 


39574 





GNMCX59R 


40477 


41178 







24695 


25185 


° 




^^^^ 


16179 





GNMCZ42R 


17126 


17641 







38912 


39364 










38062 





olMIVlAAUf r 


8291 


8808 


^ 


/^MKJIA An7D 
(jNIVlAAU/ K 


. 


9793 





olMIVlAAlUr 


39307 


39822 


^ 


GNMAA10R 


37810 


38060 





IjlNlVlAAf or 





796 


1 


GNMAA76R 


1117


1517 


^ 


GNMAB01F


33973 


34541 







34969 


35306 





GNMAB04F 


53611 


54157 




GNMAB04R 


52653 


53059 


7 






-?IlZf 


37740 




GNMAB55F 


52123 


52618 


7 


GNMBA81F 


28757 




7 


GNMBA81R


27546 


28097 


7 


GNMBB21F 


40393 


40959 


7 


GNMBB21R


39008 


39449 


7 


GNMCA75F 


31357 


32032 


7 


GNMCB25F 


33514 


34085 


7 


GNMCB25R


34748 


35431 


7 


GNMCB48F 


14504 


15191 








vjIMIVIODODr 


- ^^^^^ 


37114 





vjr>JIVlL*DOOK 


35390 


36079 










42771 




r3MR/ir'RR7D 


--^H^? 


41740 


J 




--HIliB 


27807 


z 


OIN IVlV/DD«7r\ 





26530 


= 




n.KW/tm'i'io 

oINIVIOUOOK 


-5?!^^ 


50757 





oiN iviUiJO 1 r 





6629 





oiNivior I 1 r 


-55?!^ 


35727 





onlMor 1 1 K 


.^l^ 


37229 





olNMUro/r 


^.^^^^ 


52358 







49997 


50607 





olNML.r4J3r 


40695 


41177 





GNMCF45R 


41795 


42403 


Z 


GNMCF58F 


6844 


7311 




ijNIVIUrOoK 


5528 


6208 


7 






52016 


52469 




GNMCF89R 


53363 


54002 


J 




39350 


39770 


7 


oNIVlLrnoUr 


20170 


20607 


7 


oNlvlOivUzr 


43141 


43483 


7 


O IN MO fSU^ K 


41418


41852 


7 


oiNivn-.r\uor 


-11^15 


42407 


7 






40397


40952 





oiM(viLrr\/ Or 


29011


29346 





r2MRyir'k'7t;D 


27279 


27840 




olNJIVIULo / r 


37566 


38097 


7 


olMIVluLO/K 


38870 


39442 


7 " 


talMMLfLoor 


38465 


38990 


7 







37843 


7 

i 


oINIvlULOUr 


52471 


53006 





GNMCL50R 


51307 


51879 




oINMUIVn OK 


43200 


43943 


7 




oINMOIVl^or 





31677 





GNMCM28R 


29986 


30699 


1 


GNMCM75F 


29426 


30002 


^ 


GNMCM75R 


28230 


28947 




GNMCN07R 


31678 


32296 


7 


GNMCN08F 


30220 


30908 


7 


GNMCN66F 


49682 


50383 


7 


GNMCN68R 


48507 


48702 


7 


GNMCP52F 


53906 


54238 


7 


GNMCP75F 


3335 


3631 


7 


GNMCP75R 


2430 


2916 


7 


GNMCP87F 


19818 


20336 


7 


GNMCP87R 


21539 


21853 





^ — 


oINIVluvJUor 


16992 


17629 


— 


l3lNIVlL.UUoK 





16596 







8173 


8758 


z — 


GNMCQ06R 


6774 


7461 


— 


l3rNlVlL.lJ 1 Ir 


35268 


35953 


— 


OINMLrU! I K 


36305 


36981 


— 


GNMCQ13F 


28320 


29037 


— 




29418 


30079 









40783 


]- — 


GNMCQ24R 


40841 


41510 





orNMUUo/K 




20919 





olNlvlL.U&or 


^'^^-^^ 


41309 





GNMCQ55R 


41980 


42698 


^ 


GNMCS30F


49344 


49993 


]■ 


GNMCS53F 


16879 


17595 


1 




29469 


29622 


z 


GNMCV01R 


30937 


31651 


^ 


GNMCV17F 


24334 


24812 


]■ 


GNMCV18R 


25368 


26100 





(jNivlOV^or 


26427 


26916 


^- 


GNMCV29R 


24847 


25211


^- 


GNMCV69F 


16647 


17098 


1 


GNMCVSI F 


10009 


10521 


1 


GNMCV91R 


8630 


9420 


1 


GNMCX23F 


36634 


37387 


^ — 


GNMCX23R 


38318 


38893 


1 


GNMCX24R 


33857 


34497 


1 


GNMCX67F 


44537 


45096 


^ 


GNMCX67R 


45763 


46455 


^ 


GNMCX77F 


3423 


4090 


1 


ofMlvIO YODr 


'^''"'^ 


44788 


1 


GNMCY56R 


45883 


46440 


1 


GNMCY79F


37394 


38041 


^ 


GNMCY79R


38954 


39287 





vjrNiviUTo4r 





8023 


^- 


GNMCY84R 


8749 


9223 




GNMCZ21F


^^^^'^ 


28986 


J 


GNMCZ21R 


29774 


30347 


8 


GNMAA08F 


3883 




8 


GNMAA08R 


4930 


5373 


8 


GNMAA17F 


20102 


20622 


8 


GNMAA17R 


19135 


19510 


8 


GNMAA18F 


18255 


18770 


8 


GNMAA69F 


3985 


4501 


8 


GNMAABSR 


2840 


3310 


8 


GNMBA02R


18827 


19205 







GNMBA38R 


20196 


20729 


? 


GNMBB17F


16245 


16809 





GNMBB17R


14789 


15278 




GNMCD01F 


1726 


2071 


? 


GNMCD01R 


3032 


3560 


? 


GNMCD57F 


15533 


16080 





GNMCD57R 


14017 


14387 









8074 


a 

° 


opjiviwnoor 





20483 


^ 


GNMCK17F 


12025


12589 


° 


oINMUM In. 





14068 





(alNlvlONof r 




12367 





GNMCN37R 


10459 


10898 




GNMCQ71F 


15717 


16394 




GNMCQ71R 


17082 


17799 




GNMCV56F 


2818 


3221 




GNMCV56R 


4184 


4873 




GNMCW18F


11443 


12002 




GNMCW19F 


12243 


12874 


8 


GNMCX44F 


13230 


13907 


8 


GNMCX44R 


12093 


12776 




GNMCX81F


6904 


7509 





GNMCX81R 


8613 


9312 


9 


GNMAA11R 


3820 


4070 




GNMCF10F 


4237 


4718 


? 


GNMCF10R


5381 


6021 


? 


GNMCF16F 





6723 




vjNIviOrloK 


4976 


5578 




GNMCH10F 


8003 


8324 


? 


GNMCH10R


6412 


6686 





GNMCS36F 


8057 


8725 





GNMCX89R


7787 


8447 


10 


GNMAA12F 


700 


1214 


11 


GNMAA13F 


48121 


48639 


11 


GNMAA13R 


49787 


50045 


11 


GNMAA73F 


9309 


9827 


11 


GNMAA73R 


10319 


10725 




GNMAA95F 


5068 


5583 





GNMAA95R 






11 


GNMAB70F 


44475 


44906 


11 


GNMAB70R 


45692 


46213 


11 


GNMAB84F 


34949 


35517 


11 


GNMAB84R


35628 


36115 


11 


GNMBA30F


35071 


35637 


11 


GNMBA30R


34080 


34618 


11 


GNMBA65F


46358 


46779 







IjINMDAD&K 


48334 


48629 


TZ 


GNMBA96F 


25616 


26168 


u 


orMivioAyDK 


27180 


27576 


11 

_ 


oInIViOAi or 


12432 


13093 




oInmomo 1 r 


. 


65033 


11 


OINIVIOD/ Or 


12474 


13003 


11 


Ij IN MOD (OK 


11368 


11898 


il 


(jiNML.b/yr 


12463 


12998 


Ti 


GNMCB79R 


11374 


11879 


u 







13044 


u 




11355 


11761 


Ti 




orMIViOooor 


26453 


27107 





01NJIVlL»DoOr\ 


25225 


25878 




oINIVIOUo f K 


1837 


2210 


11 


olMlvIOU4or 


36014 


36541 


Ti 


OlMIViUU40r( 


^'^^^^ 


37833 


11 


ONIVI0U01 r 


33776 


34331 


11 


ulMMOUOl K 


32513 


32886 


11 


lalNMOrUOr 


61923 


62430 


11 


CjlMMOrUOR 


63324 


63994 


Tj 




olMMt-zr/Ur 




64548 




GNMCF20R 


62670 


63312 


11 

U 


GNMCF27F 


7865 


8322 




GNMCF27R 


6252 


6941 


Vi 


oiNlviorol r 


2643 


3144 


11 


oiMlvlOrol K 


3621 


4255 


1 

1 


olMMOro^r 


34812 


35310 




oNMOrd^R 


33489 


34167 


u 


oi>jiviL»r4*tr 


-2?2L 


8323 


11 


riKthAr^CA AO 


6275 


6806 


u 


ulNlVH_>r04r 


4208 


4682 


11 


v3lMmOr04K 


-^I?? 


6419 


u 


olNMOrliiyr 


4781


5137 


11 


(jINiVIOnf Dr 


-^Ul 


61203 


Ti 




-^^IH 


62403 


Ti 


oINIVlUrxoUr 


40661 


41202 


Ti 


ulNMOrxOUK 




39847 


Ti 


GNMCL01F


59052 


59569 


11 


GNMCL01R 


57689 




11 


GNMCL62F 


36623 


37174 


11 


GNMCL62R 


38138 


38721 


11 


GNMCL65F 


11758 


12282 


11 


GNMCL65R 


13221 


13807 


11 


GNMCM44R 


3393 


4077 


11 


GNMCM85R 


60497 


61118 


11 


GNMCN29F 


75370 


76048 





11


GNMCN29R 


76487 


77001 




GNMCN90F 


53115 


53836 


U 


GNMCN90R 


51986 


52525 


11 


GNMCP26F 


38602 


39106 


11 


GNMCP26R 


37257 


37549 


11 


GNMCQ58F 


61396 


62055 


u 


GNMCQ58R 


62637 


63355 


11 


GNMCS12F 


7065 


7598 


11 


GNMCV05F 


4623 


5085 


1^! 


GNMCV06R 


3299 


4083 


""^ 


GNMCV16F 


51884 


52341 


11


GNMCV17R


53784 


54354 


11 


GNMCV88F


70556 


71043 


11


GNMCV88R 


69005 


69740 


11 


GNMCW41F


39495 


40133 


11 


GNMCX04F 


26396 


27141 


11 


GNMCX04R 


25242 


25882 


""^ 


GNMCX65F 


43846 


44360 


11 


GNMCX65R 


45795 


46258 


11


GNMCY01F 


42714 


43318 


11 


GNMCY03F 


16064 


16747 


"•.l. - 


GNMCY03R 


17171 


17665 


11 


GNMCY76F 


36967 


37624 




GNMCY76R 


38440 


38999 


~ 1 

— V- — 


GNMCZ26F 


45695 


46211 


1! 


GNMCZ26R 


46903 


47445 


11 


GNMCZ30F 


53419 


53933 


11 


GNMCZ30R 


54651 


55202 




GNMCZ86R 


43568 


43996 


12 


GNMAA14F 


51035 


51374 


12 


GNMAA62F 


22307 


22668 


"•2 


GNMAA62R 


21211 


21585 


12 


GNMAA84F 


4132 


4648 


12 


GNMAA84R 


3028 


3497 


12 


GNMAB19F 


53197 


53641 


12 


GNMAB19R 


51715 


51941 


12 


GNMAB34F 


59820 


60248 




GNMAB75F 


8230 


8726 


12 


GNMAB75R 






12 


GNMBA16F 


61880 


62448 


12 


GNMBA16R 


63397 


63930 


12 


GNMBA55F 


54894 


55463 


12 


GNMBA55R 


53249 


53699 


12 


GNMBB07F 


45401 


45967 


12 


GNMBB07R 


46474 


46846 


12 


GNMBB23F 


23330 


23896 





12 


GNMBB23R 


21762 


22258 


12 


GNMBB28F 


17524 


18093 


12 


GNMBB28R 


19255 


19581 


12 


GNMCA08F 


80267 


80572 


12 


GNMCA26F


95492 


95876 


12 


GNMCB71F 


3761 


4447 


12 


GNMCB71R 


2760 


3305 


12 


GNMCD40F


25822 


26340 


12 


GNMCD40R 


27392 


27712 


12 


GNMCF14F 


254 


698 


12 


GNMCF23F 


25032 


25512 


12 


GNMCF23R 


26296 


26954 


12 


GNMCF59F 


543 


781 


12 


GNMCF59R 


1909 


2359 


12 


GNMCF75F 


38537 


38993 


12 


GNMCH09F


70027 


70360 


12 


GNMCH09R 


68764 


69057 


12 


GNMCK63F 


82010 


82461 


12 


GNMCK63R 


83284 


83844 


12 


GNMCL27F 


36594 


37139 


12 


GNMCL27R 


38339 


38900 


12 


GNMCL83F 


24969 


25304 


12 


GNMCL83R 


26594 


27175 


12 


GNMCM24F 


58035 


58620 


12 


GNMCM24R 


56788 


57519 


12 


GNMCM26R 


43862 


44449 


12 


GNMCM33F 


59354 


60069 


12 


GNMCM33R


58194 


58939 


12 


GNMCN23F 


31658 


32330 


12 


GNMCN23R 


29999 


30623 


12 


GNMCP07F 


62762 


63498 


12 


GNMCP07R 


61716 


62463 


12 


GNMCQ25F 


29033 


29713 


12 


GNMCQ25R 


27952 


28642 


12 


GNMCQ31F 


33826 


34489 


12 


GNMCQ31R 


32628 


33318 


12 


GNMCQ35F 


99046 


99645 


12 


GNMCQ35R 


100151 


100867 




GNMCS06F 




35790 


12 


GNMCS07F 


38327 


38874 


12 


GNMCS37F 


93209 


93927 


12 


GNMCS45F 


52207 


52867 


12 


GNMCS59F 


49955 


50647 


12 


GNMCS63F


13556 


14245 


12 


GNMCS75F 


95191 


95899 


12 


GNMCS94F 


39007 


39638 





12 


GNMCV02F


96642 


97004 


12 


GNMCV03R 


95290 


96043 


12 


GNMCV19F 


13169 


13632 


12 


GNMCV20R 


11334 


12063 


12 


GNMCV67F


12472 


12929 


12 


GNMCV67R 


11158 


11877 


12 


GNMCV95F 


48011


48518 


12 


GNMCV95R 


48642 


49450 


12 


GNMCX03F 


64105 


64613 


IB 


GNMCX03R


65502 


66139 


IB 


GNMCX62F 


91416 


91831 


12 


GNMCX68R 


55716 


56405 


12 


GNMCX82F 


55372 


56082 


12 


GNMCX82R


54147 


54839 


12 


GNMCX90F


81959 


82454 


12 


GNMCX90R


83099 


83791 


1? 


GNMCX91F 


82087 


82392 


12 


GNMCY47F 


80254 


80920 


12 


GNMCY47R


78886 


79381 


12 


GNMCY81F 


17736 


18413 


12 


GNMCY81R 


19180 


19621 


12 


GNMCZ02F 


24891 


25412 


12 


GNMCZ02R 


26406 


26946 


12 


GNMCZ10F 


34243 


34706 


12 


GNMCZ10R


35555 


36086 


12 


GNMCZ54F 


59674 


60174 


12 


GNMCZ54R 


58180 


58651 


12 


GNMCZ65F 


70323 


70828 


12 


GNMCZ65R 


71871 


72382 


13 


GNMAA19F


12931 


13449 


15 


GNMAA19R


11822 


12291 


13 


GNMAA55R


4581 


5101 


13 


GNMAA63F 


36862 


37225 


13 


GNMAA63R


35706 


36096 


13 


GNMAA77F 


20561 


20750 


13 


GNMAB20F 


14416 


14852 


II 


GNMBA41R


21126 


21626 




GNMCB15F 


3423 


3980 


13 


GNMCB15R 


4343 




13 


GNMCB38F 


22717 


23346 


13 


GNMCB38R


21451 


22022 


13 


GNMCB57F 


11695 


12343 


13 


GNMCD23F


33967 


34506 


13 


GNMCD23R


32498 


32984 


13 


GNMCD27F


25756 


26330 


13 


GNMCD27R 


24266 


24695 





13 


GNMCD30F 


25823 


26369 


13 


GNMCD30R 


24703 


25016 


13 


GNMCD91F 


36457 


36958 


13 


GNMCF77F 


11321 


11777 


13 


GNMCF77R 


9878 


10580 


13 


GNMCH04F 


9222 


9510 


13 


GNMCK07F 


20658 


21162 


13 


GNMCK07R 


21983 


22516 


13 


GNMCK24F 


11029 


11566 


13 


GNMCK24R 


12531 


12904 


13 


GNMCL26F 


33412 


33883 


13 


GNMCL26R 


32004 


32585 


13 


GNMCL42F 


25017 


25487 


13 


GNMCL42R 


26410 


26988 


13 


GNMCM18F 


9081 


9580 


13 


GNMCM18R 


7774 


8463 


13 


GNMCM79F 


28296 


28959 


13 


GNMCM79R 


29623 


30321 


13 


GNMCN57F 


43959 


44583 


13 


GNMCN57R 


42560 


43109 


13 


GNMCO81F


36053 


36717 


13 


GNMCO81R


34853 


35488 


13 


GNMCP18F 


20932 


21612 


13 


GNMCP18R 


19724 


20394 


13 


GNMCS73F 


26639 


27284 


13 


GNMCS76R 


25539 


26264 


13 


GNMCV09F 


46801 


47242 


13 


GNMCV10R 


45342 


46019 


13 


GNMCV48F 


40436 


40867 


13 


GNMCV81F 


21352 


21853 


13 


GNMCW37F 


45183 


45820 


13 


GNMCX11F 


1628 


2393 


13 


GNMCX11R 


2983 


3629 


13 


GNMCX76F 


41236 


41920 


13 


GNMCX76R 


42308 


42978 


13 


GNMCY20F 


20524 


21188 


13 


GNMCY20R 


19350 


19922 


13 


GNMCY46F 


15097 


15751 


13 


GNMCY46R 


16501 


17054 


13 


GNMCY87F 


21699 


22313 


13 


GNMCY87R 


20274 


20660 


13 


GNMCZ29F 


46571 


47106 


14 


GNMAA20F 


2883 


3399 


15 


GNMAA21F 


12719 


13236 


15 


GNMAA21R 


11967 


12439 


15 


GNMAA83F 


2799 


3318 





15 


GNMAA83R 


3978 


4448 


15 


GNMBA09F 


4054 


4621 


15 


GNMCB52F 


15275 


16007 


15 


GNMCB52R 


16498 


16827 


15 


GNMCB77F 


18627 


19229 


15 


GNMCB77R


20264 


20766 


15 


GNMCB83F


18623 


19271 


15 


GNMCB83R 


20266 


20777 


15 


GNMCL14F 


3072 


3593 


15 


GNMCL14R


1651 


2228 


15 


GNMCL87R 


9692 


10245 


15 


GNMCN52F 


5357 


5991 


15 


GNMCN52R 


6753 


7339 


15 


GNMCP45F 


11548 


12079 


15 


GNMCP45R


13429 


13801 


15 


GNMCQ09F 


19788 


20364 


15 


GNMCQ09R 


18441 


19134 


15 


GNMCQ40F 


20922 


21572 


15 


GNMCQ40R 


22245 


22939 


15 


GNMCV26F 


13405 


13894 


15 


GNMCV27R 


12194 


12828 


15 


GNMCW08F 


23327 


23910 


15 


GNMCX17F 


4323 


5048 


15 


GNMCX17R 


3040 


3690 


16 


GNMAA22F 


54115 


54632 


16 


GNMAA22R 


55087 


55557 


16 


GNMAA40R 


44790 


45219 


16 


GNMAA72F 


58127 


58639 


16 


GNMAA72R 


57179 


57650 


16 


GNMAB05F 


47515 


48081 


16 


GNMAB05R 


46674 


47004 


16 


GNMAB06F


65453 


66020 


16 


GNMAB06R 


66416 


66833 


16 


GNMAB07F 


65453 


65772 


16 


GNMAB28F 


70440 


71008 


16 


GNMAB28R 


71467 


71806 


16 


GNMAB41F 


21694 


22260 


16 


GNMAB54F 


45585 


46150 




oNMAboor 




19084 


16 


GNMBA69F 


9418 


9986 


16 


GNMBA69R


8303 


8848 


16 


GNMBA76F 


39980 


40549 


16 


GNMBA76R


41451 


41944 


16 


GNMBA79R


1185 


1359 


16 


GNMCA89F 


63127 


63781 


16 


GNMCB30F 


5241 


5748 





16 


GNMCB32R


3919 


4495 


16 


GNMCD69F


20174 


20609 


16 


GNMCD69R


21508 


21899 


16 


GNMCD74F


20264 


20751 


16 


GNMCF08F


25798 


26287 


16 


GNMCF08R 


24361 


25036 


16 


GNMCF36R 


42733 


43371 


16 


GNMCF46R


4203 


4663 


16 


GNMCF48F


40973 


41398 


16 


GNMCF48R 


39629 


40232 


16 


GNMCF73F


27684 


28143 


16 


GNMCF73R 


26442 


27127 


16 


GNMCF81F


67923 


68332 


16 


GNMCH17F


68971 


69291 


16 


GNMCH34R


22199 


22496 


16 


GNMCK28F 


17936 


18486 


16 


GNMCK28R 


16766 


17104 


16 


GNMCK32F 


20788 


21317 


16 


GNMCK32R 


21768 


22345 


16 


GNMCK85F 


4360 


4910 


16 


GNMCK85R 


5620 


6191 


16 


GNMCL06F


5123 


5624 


16 


GNMCL06R 


3812 


4383 


16 


GNMCL34F


28058 


28532 


16 


GNMCL34R


26957 


27535 


16 


GNMCL63F


31053 


31621 


16 


GNMCL63R


32284 


32700 


16 


GNMCL70F 


26168 


26684 


16 


GNMCM31F


50181 


50817 


16 


GNMCM31R


48867 


49582 


16 


GNMCN28F 


69538 


70215 


16 


GNMCN28R 


68459 


69068 


16 


GNMCN84F 


68423 


69040 


16 


GNMCN84R


66998 


67589 


16 


GNMCO18F


2622 


3166 


16 


GNMCO18R


1677 


2332 


16 


GNMCO35F


70510 


71084 


16 


GNMCO35R


69198 


69780 


16 


GNMCP19F


46453 


47147 


16 


GNMCP19R 


48299 


48962 


16 


GNMCP43F


14799 


15124 


16 


GNMCQ02F


19223 


19930 


16 


GNMCQ02R 


20338 


21001 


16 


GNMCQ22F


21355 


22030 


16 


GNMCQ22R 


19917 


20600 


16 


GNMCQ53F


7175 


7907 





16 


GNMCQ53R 


8198 


8928 


16 


GNMCQ96R 


29546 


30182 


16 


GNMCS41F


29075 


29776 


16 


GNMCS68F 


9040 


9703 


16 


GNMCS75R 


1277 


1893 


16 


GNMCS76F 


2498 


3167 


16 


GNMCV38F 


37452 


37889 


16 


GNMCV55R 


34048 


34804 


16 


GNMCV60F


59043 


59536 


16 


GNMCV60R 


57614 


58367 


16 


GNMCX12F 


3746 


4302 


16 


GNMCX12R 


5111 


5734 


16 


GNMCX21F 


11333 


11997 


16 


GNMCX21R 


10200 


10848 


16 


GNMCX63F


225 


712 


16 


GNMCY14F 


72030 


72750 


16 


GNMCY14R 


70731 


71300 


16 


GNMCY23F 


43229 


43994 


16 


GNMCY23R 


42063 


42641 


16 


GNMCY41F 


27768 


28553 


16 


GNMCY41R 


28801 


29356 


16 


GNMCY50F 


59253 


60030 


16 


GNMCY50R 


58094 


58480 


16 


GNMCY59F 


48831 


49574 


16 


GNMCY59R 


50018 


50543 


16 


GNMCZ40F 


12172 


12645 


16 


GNMCZ40R 


13578 


14094 


16 


GNMCZ41F 


60265 


60795 


16 


GNMCZ41R 


61535 


62088 


16 


GNMCZ80F 


29797 


30278 


16 


GNMCZ80R 


28542 


29086 


16 


GNMCZ90R 


34086 


34573 


17 


GNMAA23F


31103 


31553 


17 


GNMAA23R 


32120 


32558 


17 


GNMAA31F 


20779 


21295 


17 


GNMAA31R 


21615 


22086 


17 


GNMAA67F 


32770 


33282 


17 


GNMAA67R 


33955 


34310 


17 


GNMAB08F 


35151 


35717 


17 


GNMAB08R 


33887 


34310 


17 


GNMBA18F 


51385 


51952 


17 


GNMBA36F 


8398 


8967 


17 


GNMBA36R 


9832 


10331 


17 


GNMBA54F


57853 


58426 


17 


GNMBA54R 


56651 


57182 


17 


GNMBA74F 


22767 


23336 



17 


GNMBA74R 


21413 


21911 


17 


GNMBA85F 


33077 


33648 


17 


GNMBA85R 


31797 


32251 


17 


GNMCA19F 


36042 


36621 


17 


GNMCB06F


26433 


26953 


17 


GNMCB06R 


28247 


28714 


17 


GNMCB10F 


38250 


38813 


17 


GNMCB10R 


36756 


37384 


17 


GNMCB82F 


31729 


32377 


17 


GNMCB82R 


32858 


33235 


17 


GNMCF22F 


37912 


38405 


17 


GNMCF22R 


36753 


37421 


17 


GNMCK05F 


7321 


7797 


17 


GNMCK05R 


5987 


6514 


17 


GNMCK57F


39678 


40046 


17 


GNMCK57R 


40958 


41325 


17 


GNMCM38F 


10453 


11189 


17 


GNMCM38R 


11737 


12393 


17 


GNMCM58F 


22688 


23288 


17 


GNMCM58R 


23628 


24315 


17 


GNMCN30F 


55573 


56235 


17 


GNMCN30R 


56832 


57420 


17 


GNMCO01F 


27343 


28038 


17 


GNMCO07F


12194 


12723 


17 


GNMCO07R


13433 


14166 


17 


GNMCO26R


5725 


6371 


17 


GNMCO43F


35750 


36434 


17 


GNMCO43R


37161 


37681 


17 


GNMCO44F


32920 


33658 


17 


GNMCO44R


31733 


32327 


17 


GNMCO55F


10439 


11147 


17 


GNMCO55R


12310 


12961 


17 


GNMCO56F


54670 


55322 


17 


GNMCO56R


55704 


56309 


17 


GNMCP57F 


10671 


10932 


17 


GNMCP57R


8680 


9034 


17 


GNMCP66F


57727 


58211 


17 


GNMCP66R 


58838 


59416 


17 


GNMCQ42F 


22050 


22733 


17 


GNMCQ42R 


23218 


23942 


17 


GNMCQ81F 


41410 


42152 


17 


GNMCQ81R 


42968 


43610 


17 


GNMCS03F 


707 


1334 


17 


GNMCS35F 


52431 


53137 


17 


GNMCS44F 


35071 


35764 


17 


GNMCS70F 


6806 


7540 



17 


GNMCS89F 


38449 


39120 


17 


GNMCS90R


39272 


39972 


17 


GNMCV42F


51980 


52438 


17 


GNMCV92F


43715 


44212 


17 


GNMCV92R 


42381 


43040 


17 


GNMCX53F


18076 


18436 


17 


GNMCX53R 


16632 


17267 


17 


GNMCY21F 


26276 


26984 


17 


GNMCY21R 


25220 


25785 


17 


GNMCY43F 


55511 


56209 


17 


GNMCY58F


10946 


11675 


17 


GNMCY58R 


9574 


10130 


17 


GNMCZ14F


4034 


4557 


17 


GNMCZ14R 


5449


5997 


17 


GNMCZ81F


12505 


13016 


17 


GNMCZ81R 


10929 


11485 


18 


GNMAA24F 


14784 


15300 


18 


GNMAA24R


15822 


16278 


18 


GNMAA91F


3107 


3623 


18 


GNMAA93F 


14115 


14633 


18 


GNMAA93R 


12779 


13156 


18 


GNMAB47F


6436 


7001 


18 


GNMCA24F 


17599 


18212 


18 


GNMCB51F


10483 


11109 


18 


GNMCB51R 


9080 


9547 


18 


GNMCK79F


4421 


4931 


18 


GNMCK79R 


5949 


6533 


18 


GNMCM27F 


17624 


18228 


18 


GNMCM27R


16432 


17178 


18 


GNMCM56F 


13615 


14160 


18 


GNMCM56R


14770 


15435 


18 


GNMCN40R


15893 


16523 


18 


GNMCN44F 


14468 


15195 


18 


GNMCN44R 


15922 


16524 


18 


GNMCP83F 


14201 


14738 


18 


GNMCP83R 


15673 


16259 


18 


GNMCY13F


2490 


3240 


18 


GNMCZ03F 


14791 


15109 


18 


GNMCZ03R 


16087 


16657 


18 


GNMCZ15F 


6918 


7405 


18 


GNMCZ15R 


5483 


6044 


18 


GNMCZ61F 


15232 


15736 


18 


GNMCZ61R 


16804 


17347 


19 


GNMAA25F 


3689 


4210 


19 


GNMAA25R 


4679 


5150 


19 


GNMAA53F 


17218 


17584 



tI 







16651 







11317 


11854 


19 


GNMBA56F 


29237 


29799 


19 


GNMBB20F 


42956 


43521 


19 


GNMBB20R 


41743 


42275 


19 


GNMCB08F 


1626 


2186 


19 


GNMCB08R 


2749 


3408 


19 


GNMCB49F 


24542 


25193 


19 


GNMCB49R 


23154 


23800 


^1 


GNMCB50F 


1442 


2136 




GNMCB50R 




1122 


?i 

19 


GNMCB84F 




26173 


— 


GNMCB84R 


24112 


24577 


II 


GNMCD36F 


32463 


32986 


19 


GNMCF17F


11187


11695 




GNMCF17R 


9855 


10520 


10 

II 




43830 


44301 






42446 


43137 


TH 

II 




46052 


46506 


19 


GNMCH41R 


48920 


49204 


II 


GNMCK19F 


5471 


5977 




GNMCK19R 


6934 


7451 


19 


GNMCK60F 


19464 


19828 


19 


GNMCK60R 


20624 


21189


19 


GNMCL07F 


29947 


30379 


19 


GNMCL07R 


31253 


31828 


19 


GNMCL47F 


13187 


13681 


19 


GNMCL47R 


11739 


12309 


19 


GNMCL67R 


10328 


10861 


19 


GNMCM83F 


7074 


7667 


19 


GNMCM83R 


5824 


6505 


19


GNMCM87R


6816 


7475 


19 


GNMCN69F 


21718 


22367 


19 


GNMCN69R 


23279 


23896 


19


GNMCO19F


7892 


8641 


19 


GNMCO19R


6509 


7230 


II 


GNMCQ23F 


22847 


23439 




GNMCQ23R 


24531 


25070 


19 


GNMCQ63F






19 


GNMCQ63R 


23445 


24129 


19 


GNMCS09F 


31343 


31944 


19 


GNMCS34F 


32710 


33397 


19 


GNMCV13F 


11334 


11854 


19 


GNMCV14R 


10046 


10690 


19 


GNMCX15F 


8333 


9060 


19 


GNMCX15R 


10180 


10827 



19 


GNMCX27F 


8333 


9060 


19 


GNMCX27R


10188 


10827 


19 


GNMCX56F 


40847 


41206 


19 


GNMCX56R 


41903 


42589 


19 


GNMCX87F


33938 


34084 


19 


GNMCX87R


31658 


32349 


19 


GNMCY07F 


37467 


38035 


19 


GNMCZ04R


24360 


24843 


20 


GNMAA26F 


11314 


11834 


20 


GNMAA34R 


15825 


16187 


20 


GNMBA46F 


9402 


9971 


20 


GNMBA83F 


9481 


10050 


20 


GNMBA83R 


11039 


11224 


20 


GNMBA92F 


3716 


4284 


20 


GNMBA92R 


2437 


2882 


20 


GNMCA93F


10570 


11228 


20 


GNMCB42F 


12316 


12924 


20 


GNMCB42R 


10720 


11380 





GNMCF68F 


145 


549 


20 


GNMCS13F 




3776 


20 


GNMCS19F 


3135 


3707 


20 


GNMCV43F


4932 


5463 


20 


GNMCV43R


3493 


4272 


20 


GNMCX01R 


8929 


9576 


20 


GNMCX32F 


2827 


3562 


20 


GNMCX32R


1753 


2386 


21 


GNMAA29F 


7970 


8459 


21 


GNMAA29R


6973 


7381 


21 


GNMAA79F


60518 


61036 


21 


GNMAA79R 


61382 


61783 


21 


GNMAB13F 


91199 


91695 


21 


GNMAB13R 


90065 


90490 


21 


GNMAB15F 


18098 


18666 


21 


GNMAB15R 


17086 


17514 


21 


GNMAB38F 


89228 


89794 


21 


GNMAB49F 


90018 


90554 


21 


GNMAB53F 


57858 


58423 


21 


GNMAB76F


69791 


70359 




UlNIVIADrOK 






21 


GNMBA08F 


88398 


88961 


21 


GNMBA08R 


89946 


90480 


21 


GNMBA62F 


91149 


91717 


21 


GNMBA62R 


90149 


90587 


21 


GNMBB08F 


57329 


57895 


21 


GNMBB08R 


58629 


59155 


21 


GNMCB36F


86172 


86807 







^JJ^ 


88359 


21 






55889 


21 


GNMCB40R 




^ 





21 


OINIViOL/ 1 or 




— 


21 


GNMCD13R 







-t|=|^ 


21 


GNMCD14F 





^ 


21 


GNMCD22F 


— — — 




21 




-=5|H1 




-?1H? 


91 






J^??? 


91 










91 


olNiVlUtl or 





9401 


91 


^^t^v!^^■^ CD 


joil^ 


1 0933 


T^ 




_28120 


28413 


91 






_?|Z^5 


_30288 






_16224 


-i^^I? 


21 


o In MV^ r\oZK 


~^T5^^ 





21 


oiNlVlwr\yzr 






21 




22899 


23382 


21 

— 


oNIVIOL ] Or 





tI^hi 




f2WMPI 1 J^R 

vjiNiviOL 1 ors. 






21 


UIMIVIOL lor 


— — 

— — — 


~iTp| 


21 


ONIVIWL 1 OPS 





0S790U 


21 

— 


GNMCL35F 




_^?5IZ 




^Kif^A^\ TAD 
V3NIV1L.L00K 


— — — 







21 


olNlVIOIVIU^r 


— 


_I|241 


21 


vjINIVIolViU^rx 


"zz^sl 


-IHHf 


91 


Vj|>ilVlOIVl4Zr 





_45453 


91 




olMIVIOMOl r 





71600 


9^ 




^2059 


72786 




olMMUivlOsr 


-i£!ZZ 





91 

£j 


oINIVlUIVloyK 


_1Z®?? 


48296 


±J 


GNMCM67F 





59524 


?J 


GNMCM67R 


_5Z2§5 


57810 




GNMCN01F


.29541 


30134 


91 




.^5155 





91 




J7776 





91 


oiNMOlNUf r 








21 


tjlNlvlUNZUr 


"IMli 




21 


GNMCN20R 




23262 


21 


GNMCN38R 


27178 


27843 


21 


GNMCN42F 


28721 


29325 


21 


GNMCN42R


27182 


27579 


21 


GNMCN48F


31545 


32275 


21 


GNMCN48R


30254 


30829 


21 


GNMCN56F 


38871 


39524 


21 


GNMCN56R 


37891 


38510 





GNMCN74R


^Vj^ 


.J^I^ 


21 


GNMCN76F 




-111^2 


21 


GNMCN87F 




-— 


J?H5Z . 


21 


V3INIVIOINO t f\ 





-I12?Z 


21 


GNMCO27F





12686 


21 


GNMCO27R


-— 1 


Jll^^J 


21 


GNMCO37R


^Zl? 


-^1®? 


91 




_81181 


81864 


21 







80668 


21 




— 15| 





21 


GNMCO41R


63303 


63895 


21 


GNMCO62F


24786 


25412 


21 


GNMCO62R


23316 


^^^^^ - 


91 " 

— 


GNMCO69F


29872 


30526 




GNMCO69R


28732 


29361 


21 


GNMCP53R 


42566 


43118


21 


GNMCP68F 


17274 


17781 


91 


GNMCP68R 


18590 


19166 


Y\ 


GNMCP78F


20880


21383 


21 


GNMCP78R


22662 


^3004 


21 


GNMCQ50F 


52354 


53060 


21 


GNMCQ50R


53094 


-^IIh^ 


21 




GNMCQ56F


24974 






GNMCQ56R 


26318 




■fi^fr 


21 


GNMCQ76F


26247 




21 


GNMCQ76R


27401 


28002 


21 


GNMCQ86F 


45276 


45978 


21 


GNMCQ86R 


46636 


47364 


91 


GNMCS08F 


7772 


7922 


91 




GNMCS22F 


49814 


50311




GNMCS62F


56147 


56850 


21 

— 


GNMCS82F 


1052 


^^^^ 




GNMCW22F


55865 


56223 


91 


GNMCX02R


45344 


45988 


91 




GNMCX09F


6251 


6961 





GNMCX09R 


4718 


5291 




GNMCX16F 


60624 


61395 


21 


GNMCX16R 


59855 


60393 


21 


GNMCX60F 


40043 


40437 


21 


GNMCX60R 


41031 


41715 


21 


GNMCX74F 


59663 


60376 


21 


GNMCX74R


58460 


59136 


21 


GNMCY45F 


42419 


43108 


21 


GNMCY45R


44124 


44642 


21 


GNMCY64F


58336 


59059 


21 


GNMCY64R 


57045 


57582 





GNMCZ28F


--^^ 


83440 


21 


GNMCZ28R




82250 


1^ 


o IN ivn^^H or 


- 

28043 


28521 




olNIVH_.t.4Dr< 


26632 


27064 


21 


OlNIIVIO^/ ft" 


22158 


22671 


91 




23472 


24017 


22 


olMIVIAAoUr 


2165 


2683 


22 




-5512 


3980 









25874 


22 






6103 


22 


orjiviUDjyK 




3945 


22 


olNIVlL.r\4or 


14049 


14546 


22 







13251 


22~ 




-!Z1§? 


18022 


22 


OlNIVlUL^OK 





16700 


22 


olMIVIOIVn OK 


284 


872 


22 







4891 


22 




"TTS^ 


10637 


22 


GNMCO22R





11794 





GNMCO23F




11080 


22 









12303 


22 


GNMCQ04F 




26023 


?| 


orNIVlv_>UU4K 




^4009 


24693 




orNiviv-»ol /r 


5636 


6187 







_21715 


22271 


22 


oiNiviovHOr 


-ilH! 


11552 





GNMCV45R 





12992 


22 

— 


GNMCV65F


_21938 


22388 




ldniviovvi 1 r 


^1268 


21882 


22 




^^^^ 


9752 


22 


v3lNIVlUZ.0oK 





4481 


22 


vjr\iivioz.o/ r 


92 


610 


99 


GNMCZ57R


1391 


1949 


9^ 




GNMAA32R 


501 


916 





GNMAA32F 


34126 


34644 




GNMAA78F 


12905 


13389 


24 




^^^^^ 


12173 


24 






5906 


24 


GNMAA92R 


6781 




24 


GNMBA28F 


25580 


26147 


24 


GNMBA28R 


24581 


24744 


24 


GNMBA64F


44750 


45281 


24 


GNMBA64R 


43715 


43924 


24 


GNMCA03F 


47978 


48229 


24 


GNMCA11F


5227 


5845 


24 


GNMCB53F 


31273 


31860 





GNMCB53R 


29940 




24 


GNMCD60F 


— 


49836 


24 




^55?Z 


26427 


24 




.^?Z?f 


54122 


24 


GNMCF33R 





55649 


24 


GNMCF55F 





18818 


24 


oINJIvlL'rODrv 


"5T5§^ 


17304 


24 






31484 


24 


GNMCF88R


29803 


30387 


24 


GNMCF94F 


32330 


32765 


24 




^21Zf 


31147 


P4 







21054 


24 


oiMMon/ 1 r 





20708 


24 




31152 


31629 







32456 


33004 


24 


olNML.r\.y4r 


19578 


20116 


24 







18866 


24 


oINMUL / 4r 




16693 


24 


olNlViL^L/ 4K 


"sUf 


18913 


24 


OINIvl^lVIUf r 




49161 


24 


O IN IVI O MU # rv 




-lIf?Z 


48064 


?4 







15471 


94 




15789 


16445 


24 


orMiviL.ivioDr 


32288 


32811 


24 

— 


OlNlVlv^lVlOOK 


-Hill 


31832 




olNIVlL<rM14r 





12112 


24 


GNMCN14R 


12286 


12980 


24 


GNMCN59F 


-^^^ 


47475 


24 




47935 


48525 


24 




-??ZZJ 


23206 


24 " 


GNMCN60R 


24286 


24873 


7A 

?Z 






2415 


?^ 


GNMCN91R


411 


1022 




GNMCO65F


4379 


5044 


24 

— 




^^^^ .- 


6070 




GNMCO91F


54004 


54574 


5Z 

^ 


GNMCO91R




55836 




GNMCP23F 


21885 


22586 


24 


GNMCP23R 


20351 




24 


GNMCP71F 


53062 


53612 


24 


GNMCP71R 


54382 


54958 


24 


GNMCQ33F 


31360 


32059 


24 


GNMCQ33R


30167 


30816 


24 


GNMCS10F 


52384 


52999 


24 


GNMCS79R 


9557 


10245 


24 


GNMCV21F


13147 


13602 



24 


GNMCV22R 


14356 


il^is 


24 


GNMCV63F 


11801 




24 


GNMCV63R 


12681 


13494 


24 


GNMCV66F 


53565 


"54040 


24 


GNMCV66R 


52285 


"53073 


24 


GNMCV73R 


42644 





24 


GNMCV78F 


■^^^ 


"24161 


24 


GNMCV78R 




"25362 


24 


GNMCX22F 


8574 


9293


24 


GNMCX22R 





— — — 




GNMCX33F 






-i^P^ 





GNMCX33R 


"21803 




24 


GNMCX34F 


23296 








GNMCX34R 


"liT^S 


"22355 




GNMCX40F 




28866 


24 


GNMCX40R 


"29005 


~T5^M 


24 


GNMCX70F 


~Toii8 




24 


GNMCX70R 


TT461 





24 


GNMCX72F 


"27541 


^7741 


24 


GNMCY35F 


32221 





24 


GNMCY35R 


31087 





24 


GNMCY55F 


45603 


"46359 


24 


GNMCY66R 


2897 


^449 


24 


GNMCY77F 


29179 


29866 


24 


GNMCY77R 




"^fp 


24 


GNMCY82F 


9582 




24 


GNMCY82R 




— — — 


24 


GNMCY94F 




-— — 


24 


GNMCY96F 


22341


— 


24 


GNMCY96R 










GNMCZ37F 





"l^lP 


24 


GNMCZ37R 







M 


GNMAA34F 


-— 









GNMBA48F 


"4952 




25 


GNMBA48R 


-— — 




% 


GNMCA16F 












GNMCB09F 


— — 





25 


GNMCB09R 


"23872 




25 


GNMCD04F 


2415 


2961 


25 


GNMCD04R 


1176 


1633 


25 


GNMCK09F 


3101 


3667 


25 


GNMCK09R 


4706 


5009 


25 


GNMCK50F


8704 


9235 


25 


GNMCK50R 


10150 


10511 


25 


GNMCM76R 


3069 


3807 


25 


GNMCM96F 


13743 


14447 



25 


GNMCM96R 


12253 





25 


GNMCN04R 


15105 




25 


GNMCN05F 


13789 


14465 


25 


GNMCP16F 


9455 


' 


25 


GNMCP16R 


8452 


"9076 


25 


GNMCP62R 




l0498 


25 


GNMCX61F 


2026 


2420


25 


GNMCX61R 





1850 


25 


GNMCY04F 







25 


GNMCY04R 








25 


GNMCZ20F 


13438 


-— 


25 


GNMCZ20R 


I23T1 


— — — 


26 


GNMAA37F 


"^sTis 





26 


GNMAA37R 


"46181 


-— — 

15T5^ 


26 


GNMAA44F 


38832 





26 


GNMAA44R 


"I^^F 


q° 


26 


GNMBB25F 






26 


GNMBB25R 


"43O8 


— — 


26 


GNMCA28F 




— — 


— — 

— — 1 


26 


GNMCB61F 







26 


GNMCE76F


l46 




26 


GNMCE76R 


1633 


1980 


26 


GNMCF66F 


27879 





26 


GNMCF66R 


29423 





26 


GNMCL21F 


-p^l 





26 


GNMCL21R 







26 


GNMCL69F 


3546


— — 


26 


GNMCL69R 


4207


"4797 


26 


GNMCM34R 


"^940 


—— 


26 


GNMCM89F 


5891


— — 


26 


GNMCM89R 


"toTo 


"ttTr 


26 


GNMCM92R 


30750 





26 


GNMCN54F


28683 


— — — 


26 


GNMCN54R 


"Iriis 







26 


GNMCN79F 




f 


26 


GNMCN79R 


"50402 




26 


GNMCO14F


"33740 





26 


GNMCO14R


"35347 


16067 


26 


GNMCQ26F 


47379 


47982 


26 


GNMCQ26R 


48736 


49406 


26 


GNMCS81F 


36588 


37281 


26 


GNMCS88F 


19142 


19409 


26 


GNMCS89R 


17251 


18014 


26 


GNMCV32F 


18068 


18514 


26 


GNMCV33F 


30470 


30781 


26


GNMCV33R


28683 


29309 





GNMCV70F 


- V,^^ 


- j1?2£2_ 


26 


GNMCV70R 




43282 


26 


GNMCV76F 


--— — 


-52IH2 


26 




— — 




32063 


26 


GNMCV86R 




^3300 


26 


GNMCV87F 





^1522 


26 


GNMCV87R 





-^5522 




GNMCX26R 


"42058 


-^|212 


26 


GNMCY31R


-—z 


—522 




OINIVI^^TOOr 





-^5^2? 


26 


GNMCY86R 





-?2Z52 


^ 


GNMCZ13F 





-?15IZ 




GNMCZ13R 


— — — 

— 


25572 


26 


GNMCZ64F 


^6763 


27169 


9R 


o 1 N 1 Vl K 





28534 


26 


oNMoZ./ 1 r 


.^I^ 


47955 


26 


vjINlVlk^Z / 1 K 


^^^^'^ ..... 


46606 


5fi 


GNMCZ95R 





8499 


26 




.522^ 


8483 


97 




(JINIV1AA41 r 


-525? 


3402 




V3lNIVI/\A41 K 





2677 


27 


GNIVIAABSF 





-22?22 


27 

— 


VjINmMMDOtA 




60457 




OiNiVlMDOOr 




-MIZI ^ 


38746 


27 


vjlNIVIMDoOr\ 


36806 


37326 


27 


VJiNIVIMDODr 


J2§1^ 


-£1522 


27 


O In IVIMt5oDr\ 




22429 


27 


GNMAB92F 


Ti?^^ 





27 


GNMBA25F 


28880 


29408 


27 


GNMBA25R 




-£5215 


27 


GNMBA49F 


"55TiZ 


^^^^ 


P 


uiMiviuDicor 


-— 


16497 











15180 


27 


olNlviODOUrv 




14996 


27 


f:sM^/lPR'^^;^^ 




-5512? 


34099 


27 




_5?212 


32548 


27 


OINIVIODO / r 


-5125Z 


-5£52Z 


27 






31421 


27 


GNMCB58F 


30329 




27 


GNMCB56R 


31809 


32460 


27 


GNMCD63F 


15824 


16290 


27 


GNMCD79F 


63644 


64156 


27 


GNMCD79R 


62110 


62364 


27 


GNMCF64F 


41517 


41871 


27 


GNMCF84F 


518 


956 


27 


GNMCF84R 


1834 


2533 



27 


GNMCF85F 


6358 




27 


GNMCF85R 


7660 


"~M§I 


27 


GNMCH76F 


-Jllll 


22966 


27 


GNMCH77F 




-imi 


27 


GNMCK01F 


"62394 




27 


GNMCK01R 


60888 





27 


GNMCK18F 


gg^g^ 


■ 


27 


GNMCK18R 




"65724 


27 


GNMCK25F 


27644 


"28213 


27 


GNMCK61F 


32761 


~33T07 


27 


GNMCK61R 


30995 





27 


GNMCK76F 








27 


GNMCK76R 







27 


GNMCK81F 


"eio^ 


— — 


27 


GNMCK81R 


"59863 


"60445 


27 


GNMCK87F 


"36665 


36996 


27 


GNMCK87R 


^4928 


35498


27 


GNMCL44F 


^8519 




27 


GNMCL44R 


37283


^7863 


27 


GNMCL76F 


"49805 


50300 


27 


GNMCL76R 








27 


GNMCM23F 


"27097 




27 


GNMCM23R 


"25771 




^ 


27 


GNMCN12F 


"8559 




27 


GNMCN12R 


Tiei 


TtM 


27 


GNMCN13F 


68144 




68833 


27 


GNMCN13R 


-^P^^ 




27 


GNMCN17F 







27 


GNMCN17R


35179 





27 


GNMCN18F 


"l^iTS 


-— 


27 


GNMCN18R 







27 


GNMGN34F 


"59534 


60268 


27 


GNMCN34R 


19457 


"Ti^Ti 


27 


GNMCN38F 







27 


GNMCN61F




--— 


27 


GNMCN61R 





— 




27 


GNMCN70F 


"32750 




27 


GNMCN80R 


"37432 


"38115 


27 


GNMCN81F


38597 


39329 


27 


GNMCN81R 


37434 


38096 


27 


GNMCO02R 


59813 


60549 


27 


GNMCO38F


51253 


51930 


27 


GNMCO52R


33701 


34400 


27 


GNMCO57F


37843 


38469 


27 


GNMCO57R


36757 


37320 


27 


GNMCP50F 


7088 


7522 



27 


GNMCP50R 


5679 


6058 


27 


GNMCQ93R 


2933 


3510 


27 


GNMCS49F


11768 


12343 


27 


GNMCV50F 


28795 


29193 


27 


GNMCV50R 


27644 


28413 


27 


GNMCV85F 


21568 


22089 


27 


GNMCV85R 


22559 


23351 


27 


GNMCW02F 


47088 


47658 


27 


GNMCW24F


56091 


56713 


27 


GNMCY27R 


5455 


5536 


27 


GNMCY33F 


37884 


38598 


27 


GNMCY33R 


39134 


39678 


27 


GNMCY62F 


39794 


40529 


27 


GNMCY62R 


41156 


41683 


27 


GNMCY63F 


39843 


40316 


27 


GNMCY72F 


15711 


16330 


27 


GNMCY72R


14681 


15239 


28 


GNMAA45F 


4450 


4816 


28 


GNMAA54R 


4273 


4733 


28 


GNMCD82F 


1790 


2266 


28 


GNMCD82R


3389 


3826 


28 


GNMCO78F


6645 


7293 


28 


GNMCO86F


6688 


7310 


28 


GNMCO86R


8039 


8651 


28 


GNMCW05F 


6711 


7331 


28 


GNMCZ09F 


13148 


13623 


28 


GNMCZ09R 


11925 


12279 


29 


GNMAA47F 


27107 


27473 


29 


GNMAA47R


25852 


26322 


29 


GNMAA71F 


19984 


20503 


29 


GNMAA71R 


21408 


21826 


29 


GNMAA80R 


20918 


21282 


29 


GNMAB31F 


32769 


33333 


29 


GNMAB31R


31525 


31942 


29 


GNMAB77F 


21439 


22007 


29 


GNMAB77R


22335 


22857 


29 


GNMCA22F


9411 


10028 


29 


GNMCB74F 


26713 


27450 


29 


GNMCB74R 


25839 


26476 


29 


GNMCD08F 


17015 


17514 


29 


GNMCD31F 


19776 


20146 


29 


GNMCF43F 


26320 


26631 


29 


GNMCF43R


27361 


28023 


29 


GNMCF87F


30819 


31269 


29 


GNMCF87R


32125 


32845 


29 


GNMCH41F 


30939 


31379 



29 


GNMCK20F


2703 


3104 


29 


GNMCK20R 


4020 


4346 


29 


GNMCL02F


32166 


32619 


29 


GNMCL02R 


33533 


33884 


29 


GNMCL12F 


360 


831 


29 


GNMCL12R 


1490 


2039 


29 


GNMCL73R


32923 


33504 


29 


GNMCL85R 


10861 


11425 


29 


GNMCM77F 


17717 


18313 


29 


GNMCM77R 


16440 


17172 


29 


GNMCN64F 


6192 


6750 


29 


GNMCN64R 


7430 


8018 


29 


GNMCN68F 


30002 


30712 


29 


GNMCN83F 


34059


34776 


29 


GNMCN83R 


32873 


33458 


29 


GNMCO28F


7197 


7872 


29 


GNMCO28R


8396 


9089 


29 


GNMCO53F


20633 


21342 


29 


GNMCO53R


22061 


22663 


29 


GNMCO67F


1523 


2102 


29 


GNMCO67R


2871 


3524 


29 


GNMCP82F 


30881 


31419 


29 


GNMCP82R 


29550 


30117 


29 


GNMCS26F 


30683 


31168 


29 


GNMCS90F 


16067 


16703 


29 


GNMCS91R 


16949 


17757 


29 


GNMCW09F 


3770 


4381 


29 


GNMCY19F 


14037 


14742 


29 


GNMCY89F 


7491 


8173 


30 


GNMAA48R 


1027 


1347 


30 


GNMAB21R 


3808 


4233 


30 


GNMCC90F 


7658 


8102 


30 


GNMCL10F 


2942 


3470 


30 


GNMCL10R


4319 


4883 


30 


GNMCM64R


7645 


8319 


30 


GNMCO63F


12259 


12933 


30 


GNMCO63R


11104 


11789 


30 


GNMCP58F 


8513 


9047 




GNMCP58R 




10707 


30 


GNMCV03F 


10383 


10724 


30 


GNMCV04R 


8992 


9749 


30 


GNMCX06F


11346 


12072 


30 


GNMCX06R 


12784 


13418 


30 


GNMCX18F 


11968 


12726 


30 


GNMCX18R 


13547 


14189 


30 


GNMCX71F 


9073 


9653 



30 


GNMCX71R


7669 


8353 


30 


GNMCY15F 


3214 


3933 


30 


GNMCY15R 


1508 


2079 


31 


GNMAA49F


7079 


7444 


31 


GNMAA49R 


5736 


6260 


31 


GNMBA38F 


692 


1262 





GNMBA79F


7797 


8367 


31 


GNMCL32F 


3721 


4184 


31 


GNMCL32R 


2230 


2815 


31 


GNMCN88F 


1761 


2482 


31 


GNMCN88R 


3292 


3892 





GNMCQ51F 


3265 


3909 


31 


GNMCQ51R 


4295 


5012 


31 


GNMCX63R 


7311 


8010 


31 


GNMCY61R 


4386


4868





GNMCY91F 


2862 


3456 


32 


GNMAA52F 


1739 


2107 


32 


GNMAA52R 


2617 


3138 


32 


GNMAA89F 


13148 


13666 


32 


GNMAB90F 


5624 


6192 


32 


GNMAB90R 


6600 


7118 


32 


GNMCF38F


3403 


3878 


32 


GNMCF38R 




5237 


32 


GNMCK38F 


6598 


7143 


32 


GNMCK38R 


5207 


5792 


32 


GNMCP85F 


6949 


7473 


32 


GNMCP85R


5282 


5869 


32 


GNMCQ07F 


10995 


11623 


32 


GNMCQ07R


12678 


13358 


32 


GNMCV23F 


5455 


5912 


32 


GNMCV24R 


4006 


4751 


32 


GNMCX13F 


9897 


10671 


32 


GNMCX13R 


8710 


9345 


32 


GNMCX45F


3857 


4557 


32 


GNMCX45R 


2724 


3424 


32 


GNMCX57F 


6426 


6642 


32 


GNMCX57R 


6424 


6642 


32 


GNMCY06F 


10183 


10812 


32 


GNMCY06R 


9259 


9808 


33 


GNMAA57F 


2954 


3324 


33 


GNMAA57R 


1924 


2445 


33 


GNMAB30F 


5838 


6402 


33 


GNMAB30R 


4864 


5193 


33 


GNMAB48F 


8816 


9381 


33 


GNMBA50F 


7809 


8374 


33 


GNMBA50R 


6161 


6686 



33 




18305 


18918 


^ 


GNMCA80F 


3189 


3849 


33 


GNMCL88F 


12941 


13492 


^ 


GNMCL88R 


11494 


12068 


— ^ 


GNMCM57F 


6934 


7569 


II 


GNMCM57R 


7814 


8548 




GNMCN49F 


18067 


18780 


^ 







17352 


rr 

II 


'jiNiviouo'tr 


J1Z§1^ 


18524 


II 


OlNML.VJO'tK 


-1^?Z£ 


17598 






J3173 


13661 


rr 




(jINIVlUroaK 




15102 




GNMCQ29F 


13338 


14036 


rr 




11998 


12686 


^ 

f£ 


GNMCQ87F 


5967 


6647 




GNMCQ87R 


7354 


7981 







orMivn_>o4/ r 


7736 


8461 


II 


GNMCV30F 


18040 


18529 


f5 


GNMCV31F


1808 


2296 


33 


GNMCV31R


16473 


17092 


33 


GNMCV32R 


2897 


3643 


33 


GNMCY12F 


13632 


14327 


33 


GNMCY12R 


14891 


15465 


33 


GNMCZ12F 


14374 


14860 


^ 


GNMCZ12R 


12879 


13414 


^ 


GNMAA59R 


20271 


20600 


5£ 


GNMAB63F 


21594 


22082 


5f 


GNMAB87F 


4234 


4656 


^ 


GNMAB93F 


8137 


8678 


^ 


GNMAB93R 


7021 


7543 


^ 


GNMBA26F 


17728 


18076 


^ 


GNMBA31R 




20952 


?1 


GNMBA60F 


2998 


3562 


34 


GNMBA60R 


4887 


5305 


34 


GNMBA89F 


12688 


13184 


34 


GNMBA89R 


11336 


11869 





GNMBA90F 


1963 


2532 




GNMBA90R 


3410 


3918 


34 


GNMBB10F 






34 


GNMBB10R 


20494 


20791 


34 


GNMCA73F 


10776 


11434 


34 


GNMCD09F 


1576 


2151 


34 


GNMCD09R 


202 


580 


34 


GNMCL40F 


6504 


7032 


34 


GNMCL40R 


7906 


8476 


34 


GNMCM41F 


15257 


15722 



34 


GNMCM41R 


13646 


14279 


34 


GNMCM84F 


10143 


10755 


34 


GNMCM84R


11418 


12090 


34 


GNMCP65R 


13124 


13566 


34 


GNMCQ57F 


1107 


1637 


34 


GNMCQ57R


2550 


3230 


34 


GNMCV15F 


10810 


11260 


34 


GNMCV16R 


9522 


10243 


34 


GNMCX35F 


24683 


25380 


34 


GNMCX35R 


25964 


26651 


34 


GNMCX48F 


27078 


27683 


34 


GNMCX48R


25636 


26324 


34 


GNMCZ82R


4431 


4970 


35 


GNMAASOR 


9724 


9928 


35 


GNMAA81R 


42064 


42495 


35 


GNMAB09F 


29605 


30171 


35 


GNMBA37F


1865 


2426 


35 


GNMBA37R 


755 


1265 


35 


GNMCA66F


14095 


14490 


35 


GNMCB95F 


29548 


30210 


35 


GNMCB95R


28364 


28994 


35 


GNMCD41F 


4298 


4824 


35 


GNMCD41R 


2960 


3326 


35 


GNMCD49F


47011 


47510 


35 


GNMCD49R 


45671 


46032 


35 


GNMCD52F


46968 


47374 


35 


GNMCE13F 


44763 


45068 


35 


GNMCE13R 


43656 


44020 


35 


GNMCK86F 


32959 


33472 


35 


GNMCL94F 


45671 


46185 


35 


GNMCL94R 


44388 


44948 


35 


GNMCM08F 


32206 


32865 


35 


GNMCM08R 


33769 


34324 


35 


GNMCN16F 


11716 


12326 


35 


GNMCN16R 


10117 


10693 


35 


GNMCN33F 


2863 


3568 


35 


GNMCN33R 


4337 


4927 


35 


GNMCO11F


117 


667 


35 


GNMCO11R


1479 


2220 


35 


GNMCO20F


41254 


41858 


35 


GNMCO20R 


42840 


43385 


35 


GNMCP03R 


15135 


15820 


35 


GNMCP33F 


33871 


34386 


35 


GNMCP33R


31902 


32446 


35 


GNMCS31F 


25024 


25611 


35 


GNMCS80F 


26013 


26719 



35 


GNMCV20F 


11142 


11598 


35 


GNMCV21R 


9547 


10242 


35 


GNMCV41F 


1508 


1764 


35 


GNMCV41R


2993 


3375 


35 


GNMCV46F 


19148 


19638 


35 


GNMCX37F 


10287 


10978 


35


GNMCX75F


16758 


17496 


35 


GNMCX75R


17915 


18615 


35 


GNMCY38F 


35286 


36002 


35 


GNMCY38R 


36447 


37009 


35 


GNMCZ63F 


17628 


18139 


35 


GNMCZ63R 


16308 


16866 


36 


GNMAA61F 


17639 


18003 


36 


GNMAA61R 


19148 


19669 


36 


GNMAB14F


9325 


9894 


36 


GNMAB14R 


10480 


10900 


36 


GNMAB23F


5098 


5510 


36 


GNMAB23R 


5999 


6420 


36 


GNMBA04F 


7545 


8114 


36 


GNMBA04R 


8552 


9087 


36 


GNMCB81F


1908 


2616 


36 


GNMCB81R


1189 


1739 


36 


GNMCD86F 


266 


753 


36 


GNMCD86R 


1917 


2276 


36


GNMCL29F 


19188 


19732 


36 


GNMCL46F 


5977 


6459 


36 


GNMCL46R 


6855 


7431 


36 


GNMCL71R 


2286 


2862 


36 


GNMCN74F 


8750 


9460 


36 


GNMCN76R 


7557 


8138 


36 


GNMCP37R 


5055 


5645 


36 


GNMCS39F 


3380 


4120 


36 


GNMCV57F 


6730 


7217 


36 


GNMCV57R 


7760 


8463 


36 


GNMCX54F 


7658 


7977 


36 


GNMCX54R 


6197 


6884 


36 


GNMCY85R 


6699 


7077 


36 


GNMCZ06F 


17782 


18302 




GNMCZ73F 






37 


GNMAA64F 


11674 


12041 


37 


GNMAA64R 


10619 


11088 


37 


GNMAB25F 


25946 


26508 


37 


GNMAB25R 


27013 


27437 


37 


GNMAB32R 


446 


844 


37 


GNMAB89F 


2515 


3085 


37


GNMAB89R 


3403 


3923 



37 


GNMAB91F 


19524 


19900 


37 


GNMAB91R 


18389 


18909 


37 


GNMCA84F 


8986 


9651 


37 


GNMCA92F 


10174 


10831 


37 


GNMCB13F


28388 


28959 


37 


GNMCB44F 


17203 


17885 


37 


GNMCB44R 


16050 


16676 


37 


GNMCB72F 


15012 


15708 


37 


GNMCB72R 


16365 


16857 


37 


GNMCD32F 


4633 


5112 


37 


GNMCD32R 


2775 


3142 


37 


GNMCD34F 


21613 


22123 


37 


GNMCD34R 


23152 


23452 


37 


GNMCD43F


23745 


24277 


37 


GNMCF03F 


23267 


23766 


37 


GNMCF03R 


21815 


22457 


37 


GNMCK16F


12575 


13127 


37 


GNMCK69R 


981 


1281 


37 


GNMCL41F


4846 


5357 


37 


GNMCL41R 


6380 


6932 


37 


GNMCM06R 


17272 


17986 


37 


GNMCM82F 


14731 


15358 


37 


GNMCM82R 


15814 


16507 


37 


GNMCQ08F 


20211 


20740 


37 


GNMCQ08R 


18866 


19521 


37 


GNMCQ59F 


16099 


16826 


37 


GNMCQ59R 


15132 


15853 


37 


GNMCS58F


16358 


17054 


37 


GNMCV94F 


21841 


22327 


37 


GNMCV94R


20477 


21267 


37 


GNMCX07F 


25522 


26245 


37 


GNMCX07R 


26310 


26960 


37 


GNMCX69F 


10320 


10866 


37 


GNMCX69R 


11842 


12449 


37 


GNMCX93F 


7947 


8360 


37 


GNMCX93R 


6445 


6970 


37 


GNMCY18F 


10778 


11193 


37 


GNMCY18R 


9630 


10203 


37 


GNMCY67F


26216 


26689 


37 


GNMCY67R 


24586 


24992 


37 


GNMCZ87F 


28035 


28543 


37 


GNMCZ87R 


26386 


26930 


38 


GNMAA74F 


185 


702 


38 


GNMAB59F 


370 


710 


38 


GNMCM68F 


512 


991 


39 


GNMBA35F


3187 


3756 



39 


GNMCL49F 


518 


1006 


39 


GNMCM19F 


3839 


4413 


39 


GNMCM19R 


2735 


3480 


39 


GNMCM68R 


3717 


4374 


39 


GNMCN15F 


11 


695 


39 


GNMCN15R 


1589 


2036 


39 


GNMCS14F 


2485 


3018 


39 


GNMCV29F 


4010 


4481 


39 


GNMCV30R 


2621 


3321 


39 


GNMCZ91F


4347 


4839 


39 


GNMCZ91R 


3070 


3594 


40 


GNMAA75F 


1493 


2009 


40 


GNMBA84F 


14749 


15315 


40 


GNMBA84R 


13039 


13401 


40 


GNMBB27F 


7061 


7629 


40 


GNMBB27R 


5877 


6280 


40 


GNMCA65F 


10805 


11468 


40 


GNMCF01F


9566 


10068 


40 


GNMCF01R 


7689 


8249 


40 


GNMCF52F 


13446 


13800 


40 


GNMCF52R 


14807 


15448 


40 


GNMCK41F 


1322 


1894 


40 


GNMCK41R 


1 


549 


40 


GNMCN01R 


8094 


8669 


40 


GNMCN02F 


6573 


7152 


40


GNMCY39F


12214 


12932 


40 


GNMCY39R 


11377 


11773 


40 


GNMCZ75F


4573 


5040 


40 


GNMCZ75R 


3272 


3824 


41 


GNMAA82F 


1944 


2123 


41 


GNMAA82R 


540 


848 


41 


GNMCA09F 


4155 


4769 


41 


GNMCL45F 


5831 


6382 


41 


GNMCL45R


7014 


7592 


41 


GNMCX84F 


6407 


7029 


41 


GNMCX84R 


4937 


5630 


41 


GNMCZ07F 


753 


1256 


41 


GNMCZ07R 


2139 


2681 


42 


GNMAA85F


33488 


34005 


42 


GNMAA85R 


34461 


34906 


42 


GNMAB11F 


27021 


27587 


42 


GNMAB16F 


16195 


16762 


42 


GNMAB16R 


17262 


17683 


42 


GNMAB51F 


32336 


32901 


42 


GNMAB64F 


9048 


9478 


42 


GNMBA52F 


25714 


26279 



42 


GNMBA52R 


26930 


27429 


42 


GNMBA63F 


25856 


26418 


42 


GNMCA10F 


9199 


9803 


42 


GNMCA90F 


12306 


12957 


42 


GNMCD76F 


43170 


43607 


42 


GNMCD80F 


25485 


25983 


42 


GNMCD80R 


24100 


24472 


42 


GNMCD81F 


25467 


25981 


42 


GNMCF21F 


42792 


43250 


42 


GNMCF21R 


43820 


44488 


42 


GNMCF79F 


19953 


20412 


42 


GNMCF79R 


18429 


19107 


42 


GNMCH08F 


10638 


10983 


42 


GNMCH61F 


35608 


36017 


42 


GNMCK58F 


11541 


12006 


42 


GNMCK58R 


13419 


13981 


42 


GNMCM03R 


37448 


38182 


42 


GNMCM48F 


1 


622 


42 


GNMCM48R


1215 


1878 


42 


GNMCO34F


11655 


12379 


42 


GNMCO34R


10537 


11201 


42 


GNMCO70R 


39192 


39848 


42 


GNMCO84F


24768 


25509 


42 


GNMCO84R


24098 


24770 


42 


GNMCP29F 


40509 


41019 


42 


GNMCP29R 


38958 


39359 


42 


GNMCQ60F 


38032 


38565 


42 


GNMCQ69F 


8563 


9122 


42 


GNMCQ69R 


6981 


7666 


42 


GNMCS69F 


3213 


3921 


42 


GNMCV25F 


17625 


18095 


42 


GNMCV26R 


16021 


16633 


42 


GNMCX46F 


4775 


5450 


42 


GNMCX46R 


3438 


4125 


42 


GNMCX88R 


17104 


17778 


42 


GNMCY37F 


7223 


7838 


42 


GNMCY37R 


5827 


6323 


42 


GNMCY69F 


22213 


22853 


42 


GNMCY69R 


21279 


21796 


42 


GNMCZ85F 


19300 


19813 


43 


GNMAA86F 


5244 


5760 


43 


GNMAA86R 


4311 


4783 


43 


GNMCS54F 


3163 


3797 


43 


GNMCV84F 


1109 


1600 


43 


GNMCV84R 


2002 


2781 


44 


GNMAA87F 


26931 


27447 



44 


GNMAA87R 


27952 


28361 


44 


GNMAA90F 


6714 


7230 


44 


GNMAA90R 


8124 


8276 


44 


GNMAB27F


4036 


4606 


44 


GNMAB27R 


4904 


5327 


44 


GNMCD11F


4246 


4813 


44 


GNMCD11R 


5623 


6146 


44 


GNMCQ17F 


6327 


7009 


44 


GNMCQ17R 


7631 


8317 


44 


GNMCQ67F


1410 


2013 


44 


GNMCQ67R 


2571 


3261 


44 


GNMCS92F


21392 


22037 


44 


GNMCS94R


22779 


23479 


44 


GNMCS96F 


22613 


22986 


44 


GNMCX79F


14815 


15344 


44 


GNMCX79R 


16086 


16760 


44 


GNMCZ44F


19312 


19820 


44 


GNMCZ44R


20486 


21049 


45 


GNMAA88F


3827 


4313 


45 


GNMBA05F


7835 


8403 


45 


GNMBA05R 


6395 


6824 


45 


GNMCZ39F 


143 


619 


45 


GNMCZ39R 


1545 


2114 


46 


GNMAA94F 


5740 


6254 


46 


GNMAA94R 


6575 


7044 


46 


GNMAB29F 


659 


1225 


46 


GNMAB29R 


1871 


2298 


46 


GNMAB78F 


16523 


16951 


46 


GNMAB78R 


15145 


15666 


46 


GNMCA05F 


4467 


5137 


46 


GNMCD25F 


11261 


11830 


46 


GNMCD25R


10056 


10529 


46 


GNMCD45F


4725 


5273 


46 


GNMCD45R 


3455 


3826 


46 


GNMCD72F


12772 


13251 


46 


GNMCD72R 


14201 


14542 


46 


GNMCK45F 


6690 


7258 


46 


GNMCK45R 


5280 


5857 


46 


GNMCK53F 


9263 


9636 


46 


GNMCK53R 


10581 


11122 


46 


GNMCN62R 


20059 


20606 


46 


GNMCO72F


11911 


12654 


46 


GNMCO72R


10592 


11291 


46 


GNMCS38F 


8266 


8953 


46 


GNMCS50F 


9604 


10313 


46 


GNMCY09F


18777 


19443 



— 




20339 


20885 







13317 


14054 


46 


GNMCY48R 


12373 


12900 




GNMAB03F 


39285 


39849 


47


GNMAB03R 


40395 


40825 


47 


GNMAB57F 


8125 


8631 


47


GNMAB62F 


5129 


5697 


47


GNMAB72F


25957 


26522 


47 


GNMAB72R 


26812 


27332 


47 


GNMBA39F


10581 


11112 


47


GNMBA39R 


9272 


9805 


47 


GNMBA68F 


33182 


33747 


47 


GNMBA68R 


32098 


32634 


47 


GNMBB31F


46909 


47485 


47 


GNMBB31R 


45477


45996


47 


GNMCB64F 


8634 


9225 


47 


GNMCB64R 


9880 


10466 


47 


GNMCD39F 


26389 


26882 


47 


GNMCF18F


42096 


42592 




GNMCF18R 


40473 


41111 


47 


GNMCF47F 


46147 


46634 


47 


GNMCF47R 


44893 


45560 


47 


GNMCK29F 




14820 


47 


GNMCK29R 


12913 


13476 


47


GNMCK33F 


11732 


12246 


47


GNMCK33R 


10377 


10759 


47 


GNMCK51F


19259 


19619 


47 


GNMCK51R 


17899 


18248 


47 


GNMCL24F 


21022 


21491 


47 


GNMCL24R 


19374 


19922 


47 


GNMCL66F 


34263 


34768 


47 


GNMCL66R 


35478 


36049 


47 


GNMCM30R 


35959 


36642 


47 


GNMCM37R 


18280 


18787 


47 


GNMCN36F 


28250 


28958 


47 


GNMCN73F


29393 


30074 


47 


GNMCN73R 


28267 


28921 




GNMCN93F


1262 


1971 


47 


GNMCN93R 


2446 


2878 


47 


GNMCO45F


14719 


15397 


47 


GNMCO45R


15952 


16635 


47 


GNMCO49F


38118 


38828 


47 


GNMCO49R


39315 


39845 


47 


GNMCO60F


21461 


22152 


47 


GNMCO60R


19964 


20648 


47 


GNMCO83R


16405 


17063 



IZ 


nKiKAnonat: 


_46OT 


^51^ 




oNIVlL.rU{5K 


-— — 





Zt 









-151^2 


zZ 


olMIVlUrlZK 





-1???^ 






-im^ 


-§l|^^^ 


Z7 

% 


nKifi.Ar*r\7no 
VslMIVILrU f Ur( 










liTTT 




"zin^ 


47 


O I>l IVI w ADZ r 


44094 


^^Tz! 


47 









46100 


47 




8582 




47 


GNMCX73R 


7456 


8141 




GNMCY08F 


^^^^^ - 


22785 


47 


GNMCY08R 


20965 


21539 


47 


GNMCY17F 


13457 


14071 


47 


GNMCY17R 


12199 


12710 


47 


GNMCY60F 


4726 


5396 


47 


GNMCY60R 


3394 


3937 


47 


GNMCZ72F 


26112 


26584 


47


GNMCZ72R


27111


JZ?42 


48 


GNMAB10F 


45864 


46429 


48


GNMAB10R 


J6823 


_*IH15 




GNMAB26F 


^l^^l 


JiZZJ 




48 


OlNIVIAbZQK 


17068 


17496 


48 


GNMAB46F 




40166 




vjlNMAB / 1r 


36266 


"Hslf 


4R 




IJiMlvlAD / IK 


J5583 




48 


GNMBA10F 


.?12?j 




24641 


48 


GNMBA10R 


25627 


26158 


48 




2669 




48 


GNMCA69F 


24907 


25573 


48 


GNMCA77F 


44240 


44904 


48 


GNMCB68F 


48529 


49183 


48 


GNMCB68R 


49751 


50229 


48 


GNMCD05F


61093 


61524 


48 


GNMCD05R 


47029 


47548 


48 


GNMCD24F 


41436 


41982 


48 


GNMCD24R 


42664 


43161 


48


GNMCD70F 


-ipll 


43798 










48 


GNMCE07R 


46605 


47129 


48 


GNMCE90F


6380 


6925 


48 


GNMCE90R 


5283 


5799 


48 


GNMCF24F 


56963 


57448 


48 


GNMCF24R 


55581 


56243 


48 


GNMCF70R 


50946 


51263 


48 


GNMCF93F 


46705 


47157 



— 1? — 


n.Kwur'ca'io 


-^§1^? 




48 




^11^? 


24458 


zl 




60688 


61022 


48 


oiNlViUlNUor 


J^??? 


13530 




V9IMIVIU1S.U0K 





12144 


Zr 

zl 


r2MMr'i^''ji c 


"zfr^ 


48939 








47731 


zs 


V3lNIVI^I\»lDr 










Zr 




0 IN M txH D r\ 




J^551 


15 


oiMlvH_.rvOor 




^9433 


29963 


48


GNMCK56R 


27927 


28487 


48 


or>iiviL.i\f ur 


41792 


42156 


48 







43888 


48


GNMCL05F 


22552 


23041 


48 


GNMCL05R 


21742 


22293 


48 


GNMCL61F 


15321 


15724 


48


GNMCL61R 


14006 


14449 


48 




23803 


24358 


48 


oNMULodK 


22389 


22965 


48 


CjNlvlUlvl40r 


60172 


60784 


48


GNMCM40R


43992 


44623 




V3lNlVlOIV14yK 


63033 


63741 


Zfl 


v^iNIVIOIVIDUr 


^^95 





T 

zi 


ijfMIVIL/IVIaUK 





27929 


tI 


olNlVlOINyOr 


.£1Z5? 


22424 


48 


OINIVILrrMaor 


-5?15? 


53159 


48 




-f^ZZJ 


50550 


z^ 




49060 


49698 


48 


ulMlviOiJi or 





27624 


48 




25392 


26062 


z^ 


(jNIviC>09ur 


JI0121 


10652 


48 


vjfNiVlUUyUK 


_8744 


9318 


48 


GNMCP81F


26207 


26575 


48 


GNMCP81R


27441 


28017 


48 


GNMCQ16R 




661 


48 


GNMCQ36R 


13779 


14476 


48 


GNMCQ48F 


44157 


44770 




GNMCQ48R 


43032 


43754 


48 


GNMCQ64F 






48 


GNMCQ66F 


12668 


13370 


48 


GNMCQ66R 


13747 


14472 


48 


GNMCQ89F 


16922 


17619 


48 


GNMCV06F 


48695 


49152 


48 


GNMCV07R 


47510 


48231 


48 


GNMCV11F 


26723 


27238 


48 


GNMCV12R 


27836 


28452 



48 


GNMCV18F 


35744 


36244 


48 


GNMCV19R 


.^£155 


35205 


48 




J??I? 


-2211 


zl 




-|=|2^ 


-2215 




oiNML* vyor 





-1212? 


2i 

ll 






-1512? 






— — 


-1222? 


Zft 

tI 


O IN M A<^0 r\ 





-11^2Z 


zl 




-52±|| 


59998 









_5222f 


Zr 




——2 ^ 


-^H22? 


Ift 

^1 


olNlVlL.A4^K 


23577 


-^f?21 


48 


oNIVlUAOUr 





30232 


48 


oINMOaOUK 


30956 


31633 


48 


GNMCX80F 


30061 


30735 


48 


GNMCX80R 


31536 


32224 


48 


GNMCY70F


13009 


13629 


48 


GNMCY70R 


11725 


12281 


48 


GNMCZ84R 


4001 


4533 





GNMAB32F 


_:^[2j! 


684 


50 


GNMAB35F 


JI55Z 


18274 


50


GNMBA70F 





15180 


50


GNMBA70R


15849


16383 


50 


GNMCB20R 


J225? 


21453 


Is 


GNMCB89F 


JH522 


13223 


50 


l3lMIVlL>DO»K 


J14045 


J4508 





OMRilOCCTC 


-^5?| 


4879 


1^ 


vjlNlVloro/K 


-|=5L 


.2252 


50 


oiNivionoMr 





^0140 


50 


OMRJir^UQOD 
oNMUnoyK 


18248


18535 


52 


(jlNMUIs4yr 


^0201 


20665 


50


GNMCK49R 


18771 


19297 


50


GNMCM01F 


2158 


2770 


50 


orJML.MU1 K 


708 


1314 


50 


GNMCN41F


21893 


22570 


50 


GNMCN41R


23128 


23476 


50 


GNMCO04F 


2174 


2638 


50


GNMCO04R 


837 


1541 










50 


GNMCO82R


17538 


18219 


50 


GNMCP61F 


13046 


13330 


50 


GNMCP61R 


14605 


15154 


50 


GNMCS33F


27679 


28393 


50 


GNMCV22F 


21920 


22410 


50 


GNMCV23R 


20644 


21369 


50 


GNMCV47F 


17147 


17659 





GNMCV47R 


itIf 


"II^^P 


50 


GNMCV58R 




5 




GNMCV59R 


1242 




50 


GNMCX41F


3977


— — 


% 


GNMCX41R


5212


"iSTS 




GNMCY22F 


— — — 


JIt^3 


50 


GNMCY29R


22461


23008 


50 


GNMCY71F








50 


GNMCY71R


9041




50 


GNMCZ52F


20698 


— — 


50 


GNMCZ52R 








— is — 


GNMCZ94F






"ili^ 




GNMCZ94R 


— 


-5iz? 


50 


GNMCZ95F 


— — 


-— ^ 


50 


GNMCZ96F


3902






GNMAB39F 


-— — 


"SiTT 




W^ 


GNMBA51F


— — 

— — ^ 


9139


51 


OINIViDMO 1 r\ 




-— -| 


51 


GNMCL84F 


— 




51 


GNMCL84R 


-— 


"5S?5 

-T^P 


51 


GNMCO08R


— — 


Jill 


51 


GNMCY10F 


— — 


'Wr 


51 


GNMCY10R 


_ 


-515 


51 


GNMCZ33F 





"l^i 


51 


GNMCZ33R 


— — 

^ 


5244 


52 


GNMAB40F 




-— 


52 


GNMCB93F 


— — 


-|15f 


52


GNMCB93R


■r7^5 





52


GNMCF69F


— — 


-fzZ2 


52 


GNMCF69R 


— 


— 


52 


GNMCF92F 


— — 





52 


GNMCF92R 


— — 


2018


52 


GNMCL51F 


-— — 

— 


yj^^ 


52 


GNMCL51R 






52 


GNMCM61R 




j}^^ 







52 


GNMCN24F 




■JhTr 


52 


GNMCN24R 


— — r 




52 


GNMCO31F


17039





52 


GNMCO31R


18187 


18861 


52 


GNMCS05F 


11540 


12169 


52 


GNMCX49F 


10221 


10402 


52 


GNMCX49R


8569 


9260 


52 


GNMCX96R


4202 


4835 


52 


GNMCZ83F


11839 


12349 


52 


GNMCZ83R 


13065 


13609 


53 


GNMAB50F


81 


306 






OlNMABDUr 


- ^^^^ 


5141 




oNMOUDor 


- 


750 


^ 




GNMAB66F 


1314 


1623 


^ 


GNMCB73F 


3597 


4316 






--^2lE 


5644 


^ 


(jiNMulvioor 





3883 


EE 


oiNMOIvloOK 





3288 


55 


o IN M u A'^ / r 





6201 


55 


mM^>!^'Y/17D 





4982 


55 


^^^l^/10V'^4F 

vjlNlv!OTo*fr 





6305 


56 




(^Mh^ ARTQD 
olNlVIMD/ yK 


J 


246 




GNMAB80F 


19923 


20432 


57 


OlNMADoUK 





21624 


57


OINIVIDMU/ r 


14530 


15093 


z= 


(jNMBAOfR 


15847 


16378 


r= 


OiNlVlL<B1 IK 


30694 


31243 







olNiV10B4j r 


29518 


30234 


57


GNMCB47R 


28242 


28881 




GNMCD55F 


32780 


33171 


rr 

1? 




13260 


13679 


^ 


(jInMUcooR 


14546 


15067 


57


GNMCF06F 


16859 


17358 


57


GNMCF06R 


15242 


15921 


57


GNMCF40F 


18554 


19027 


57


GNMCF40R 


19698 


20365 





GNMCF50F 


20435 


20910 


57


GNMCF50R 


21576 


22262 


_ 


orMlvH_.rtDor 


^0402 


30884 






28818 


29412 


3= 


GNMCF86R 


32361 


33020 


z 


GNMCK71F 


8763 


9100 


== 




-i£255 


10613 




It 


GNMCL95F 


3811


4223 


|i 


oINMVjLyoK 




2901 





GNMCN67F 


20529 


21206 


57


GNMCN67R 


19529 


20102 


57


GNMCP09F


2860 


3520 




GNMCP09R 




2615 


57 


GNMCP70F 


17618 




57 


GNMCP70R 


18924 


19511 


57 


GNMCP79F


8875 


9372 


57 


GNMCP79R 


10275 


10855 


57 


GNMCQ41F 


20359 


21104 


57 


GNMCQ41R 


19619 


20345 


57 


GNMCQ44F 


10270 


10898 


57 


GNMCQ44R 


11575 


12244 



57 


GNMCS16F 


20638 


20868 


57 


GNMCS86F 


30569 


31246 


57 


GNMCV34F 


21537 


21988 


57 


GNMCY40F 


20132 


20855 


57 


GNMCY40R 


19153 


19716 


57 


GNMCY49R 


26133 


26607 


57 


GNMCY80F 


8452 


8787 


57 


GNMCY80R 


6998 


7416 


57 


GNMCY90F 


19373 


19946 




GNMCZ43F 


31206 


31711 


57 


GNMCZ43R


32436 


32921 


58 


GNMAB82F 


9525 


10095 


58 


GNMAB82R 


8509 


9029 


58 


GNMCO58R


15112 


15768 


58 


GNMCY78R 


3411 


3857 


58 


GNMCY83F 


11793 


12472 


58 


GNMCY83R 


10643 


11053 


59 


GNMAB85F 


2737 


3302 


59 


GNMAB85R 


1900 


2305 


59 


GNMCO33F


2304 


2941 


59 


GNMCO33R


1257 


1881 


59 


GNMCX86F 


2826 


3461 


59 


GNMCX86R 


1441 


2128 


59 


GNMCZ32F 


1619 


2126 


59 


GNMCZ32R 


2661 


3195 


60 


GNMAB95F 


13774 


14279 


60 


GNMAB95R 


15289 


15810 


60 


GNMCA30F 


937 


1556 


60 


GNMCD44F 


303 


826 


60 


GNMCF04F 


9775 


10276 


60 


GNMCF04R 


8305 


8976 


60 


GNMCF90F 


3862 


4310 


60 


GNMCF90R 


2510 


3187 


60 


GNMCH28F 


9435 


9696 


60 


GNMCK30F 


13554 


14101 


60 


GNMCK30R 


12158 


12740 


60 


GNMCM05F 


9295 


9874 


60 


GNMCM05R 


10879 


11616 




oiNlVlolViOOr 






60 


GNMCM55R 


10796 


11542 


60 


GNMCS87F 


13103 


13751 


60 


GNMCW39F 


15206 


15851 


60 


GNMCX55F 


12701 


12889 


60 


GNMCX55R 


13822 


14516 


60 


GNMCX62R 


1554 


2237 


61 


GNMBA06F 


22890 


23457 






ulNlVlDAUtJK 





24758 







J0158 


30722 


^ 


olNlVH-.t3U4K 





29214 


^ 


vsNMOD^Ir 





24428 


^ 


I3NMO02IR 


25186 


25806 


61 


GNMCB63R 


3796 


4094 


61


GNMCB86F 


23284 


23998 


^ 


UlMIVIV/bOOK 





24623 


61 


GNMCD18F 


^^^^^ 


31608 


61


GNMCF95F 


^0692 


21018 


^ 




JI2??? 


19872 


^ 


VjlMIVH_I\4Ur 


JL1^2Z 


11811 


^ 


olMIVIL»t\DOr 


-£°2Z 




61


GNMCL04F 





20543 


61 


GNMCL04R 


18687 


19271 


61


GNMCL20F 


27968 


28464 


^ 


V3lMMUL<£UK 





29840 


61 


GNMCL22F 


13417 


13939 


61


GNMCL22R 





15438


EJ 




Ji!£? 


_34771 


Ii 


o(NIViL.LOJr 




2034 


61 




~5TF 

-fiZ 


^86 


61 


GNMCL90R 


8315 


8896 


61 


GNMCM65F 


15441 


16117


61 


GNMCM65R 


14289 


14994 




GNMCM71F 


10516 


11122 


61 


GNMCM71R 


11703 


12405 


61 


GNMCO61F


14512 


15200 




GNMCO61R


^^^^^ 


13946 


61 


GNMCQ79F 


15902 


16644 


61 


GNMCQ79R 


16726 


17426 


61 


GNMCQ90F 


2342 


3073 


61 


GNMCQ90R 


804 


1426 


61


GNMCQ95F 


19198 


19483 




GNMCQ95R 


^^^^^ 


21277 


61 


GNMCS24F 


19718 


20379 


61 


GNMCS46F 


18786 


19366 


61 


GNMCV12F 


30913 


31415 




olNiVlL' V 1 or\ 




32632 


61 


GNMCY25F 


25038 


25729 


61 


GNMCY25R 


26701 


27270 


62 


GNMBA12F 


7833 


8334 


62 


GNMBA66F 


8661 


9232 


62 


GNMBA66R 


9606 


10138 


62 


GNMBB30F 


3235 


3799 


62 


GNMBB30R 


4483 


5016 



62 


GNMCB05F 


4772 




62 


GNMCB05R 


6111 


6717 


62 


GNMCD67F 


7723 




62 


GNMCF78F 


3478 





62 


GNMCM43F 


12550


13285


62 


GNMCM43R 


11540 


I2T27 


62 


GNMCP28F 


3321


3756


62 


GNMCP28R 


1814 


2235 


62 


GNMCP67F 


2320 


2824 


62 


GNMCP67R 


3943 


4497 


62 


GNMCV62F 


8092 


8582 


62 


GNMCV62R 


9694 




62 


GNMCX39F 


7125 


TT^F 


62 


GNMCX39R 


5729


6265


62 


GNMCZ55F 


5209 


"^724 


62 


GNMCZ55R 


3782


4320


62 


GNMCZ76F 


4455 


4947 


62 


GNMCZ76R 


3027 


3553 


63 


GNMBA13F 


14825 


15391 


63 


GNMBA13R 


13165 


"Hsii 


63 


GNMBA14F 


12491 




63 


GNMBA14R 


13757 


14281 


63 


GNMBA80F 


12477 


12855 


63 


GNMCB32F 


472 


756 


63 


GNMCD42F 


20565 




63 


GNMCF07F 


13708 


14215 


63 


GNMCF07R 


12522 


13201 


63 


GNMCK47F 


10432 


gg^g^ 


63 


GNMCK47R 


9275 




63 


GNMCK91R 


9054 


9617


63 


GNMCN32F 


16696


17346


63


GNMCN32R 


"liP^ 







GNMCS55F 




2208


63 


GNMCX85R 


14727


15427


63


GNMCZ11R


iggg^ 







GNMCZ18F 




"2479 


63 


GNMCZ18R 


3109 


3667 




GNMCZ34F 






63 


GNMCZ34R 


12451 


13003 


64 


GNMBA27F 


2420 


2987 


64 


GNMBA27R 


649 


1182 


64 


GNMCK68F 


8858 


9142 


64 


GNMCN47F 


8600 


9323 


64 


GNMCQ47F 


5300 


5761 


64 


GNMCQ47R 


3904 


4632 


64 


GNMCZ45F 


6005 


6471 



64


GNMCZ45R


7509




— |5ZL_ 


64 


GNMCZ89F


6722




65 


GNMBA40F


"^256 


— — 

--2522 : 


65 


GNMBA40R 


--— — 





65 


GNMCK42F


8125


--2^52 


65 


GNMCK42R


9146


--^^ 


65 


GNMCK43F


14839


- ^ 


65 


GNMCK43R 


^^^gg 


^2Z12 


65 


GNi\1Ciy^11R 


2515 




65 


GNMCO03F 


4056 


-— 


65 


GNMCO03R 






-—2 


65 


GNMCO32F


~T5159 


J 


65 


GNMCO32R


11348 




65 


GNMCO78R


QQ^^- 


— — 

-—2= 


65 


GNMCQ10F 




"TsUi 


65 


GNMCQ10R 


10149




65 


GNMCQ36F 


19 


__ 


65 


GNMCZ17F 


1839 


— — 


65 


GNMCZ17R 


-51^? 


— — 


65 


GNMCZ24R 




— — 

_z222 


65 


GNMCZ50R 


"?ni7 


-±222 


65 


GNMCZ51F


3684


-—21 


65 


GNMCZ51R 


1216 


-222Z 


66 


GNMBA45F


5960


-2^- 


66 


GNMBA45R 


4417




66 


GNMBB01F 


^556 


— — 

--2|^ 


66 


GNMBB01R 





-—22 


66 


GNMCA23F


4257


—2-2 


66 


GNMCN50F


6431


-19^ 


66 


GNMCN50R


-— — 


-22^2 


66 


GNMCO46F


-— — 


-i^^2 


66 


GNMCO46R


706




66 


GNMCQ15F


Ttss 


— — 

-—2 


66 


GNMCQ15R


994




66 


GNMCZ67F 


"l099 


"7592 


66 


GNMCZ67R 


^554 


-— — 


66 


GNMCZ68F 


"TT30 


"T^ft4 


67 


GNMBA56R


828


1363


67 


GNMCZ01F


1176 


1497 


67 


GNMCZ01R


2672 


3147 


68 


GNMBA58F


11648 


12214 


68 


GNMBA58R 


10145 


10680 


68 


GNMBB14F 


7190 


7758 


68 


GNMBB14R 


8579 


9037 


68 


GNMCD71F


502 


959 


68 


GNMCL54F 


10328 


10882 



68 


GNMCL54R 


- °° '"^^ 


.^^^ 


68 


GNMCN39F 


13282 


. J|55Z 


68 


GNMCN39R 


"T19T1 


- zt^^ 


68 


GNMCP34F 


12249 




68 


GNMCP34R 


"I052I 




-11251 


68 


GNMCP74F


9533





68 


GNMCP74R


8395




68 


GNMCV35F 


1T085 





68 


GNMCV35R 








69 


GNMBA67F










69 


GNMBA67R 


"96Q1 





69 


GNMCA68F 


138





69 


GNMCA95F 


7720 





69 


GNMCB19F 






69 


GNMCB62F


4968 


-T— 

— — ^ 


69 


GNMCB62R 


-|^^^ 




69 


GNMCD88F 




--— 

— — |- 


69 


GNMCD94F


"T0463 




69 


GNMCD94R 


T2298 


— 


70 


GNMBA87F 


8256 


— — — 

— r| 


70 


GNMBA87R 


6890 





70 


GNMCA76F 


9130 




70 


GNMCB96F 


10306 





70 


GNMCB96R 


llP^ 


— 

^ 


70 


GNMCD20F 






70 


GNMCD20R


3980


4417


70 


GNMCE77F 


losio 


IfiSSfi 


70 


GNMCF49F 


13718 







70 


GNMCF49R 


~ 


-i^^li 


70 


GNMCF57F


24615





70 


GNMCF57R


23522


liZ^ 


70 


GNMCF81R 







70 


GNMCK10F 










70 


GNMCL64F 


"^79 


l^n 


70 


GNMCL64R 


l098 




70 


GNMCM94F 


I5929 




iIt^I 


70 


GNMCM94R 







70 


GNMCO70F


6253


"6^65^ 


70 


GNMCP46F 


28269 


28572 


70 


GNMCP46R


29399 


29799 


70 


GNMCP69R 


14839 


15383 


70 


GNMCQ60R


4262 


4932 


70 


GNMCV71F 


1570 


2085 


70 


GNMCV71R 


316 


1151 


70 


GNMCV72F 


29887 


30336 


70 


GNMCV72R 


28290 


29022 
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Coordinate 


70 


GNMCV79F


9283


--|^2? 


70 


GNMCV79R


8344




70 


GNMCV90F 


15009 


— — — 




70 


GNMCV90R 


16482 




70 


GNMCX43F 





l~ 


70 


GNMCX43R 




— — 




-11I?5 




GNMCY28F 


— — ^ 


^?HZZ 


70


GNMCY28R 




27207 


70 


GNMCZ35F 


— — 

Jj~ 


J5?52 


71 


GNMBB05F









GNMBB05R 


— — 


-?5I5 


71 


GNMCQ43F 


"tSrH 

"sItt 


-§55I 


71 


GNMCQ43R 




_?224 


71 


GNMCV39F 


1444 


"H^^ 




GNMCV39R


1967 


-±°5Z 


71 


GNMCV40R 


1959 





71 


GNMCX05F 


1245 


|Z 


71 


GNMCX05R 


— 


-?^55 


71 


GNMCY02F 


IT233 




71 


GNMCY02R 


I05T9 





71 


GNMCZ22F


12199 





71 


GNMCZ22R


^^IF 


— — — 


71 


GNMCZ62F 




— — 


71 


GNMCZ62R


7330


-— — 

-—^2 


72 


GNMBB26F


8760





72 


GNMBB26R 


-— r 


J?2?? 


72 


GNMCA20F 







14085


72 


GNMCA70F 


3932


^^^^ 


72


GNMCA83F


16236


16703 


72 


GNMCD73F 


"iHsl 


17077 


72 


GNMCD73R 


IfiST^ 


-15£5? 


72 


GNMCF25F 




_!51£J 


72 


GNMCF25R 




1^5^^ 


J5?5^ 


72 


GNMCM14R 




JL!^^5 


72 


GNMCS42F 




-5125 


_51?f 


72 


GNMCS67F 


"Uri 


"ZS5^^ 


72 


GNMCS91F 




-Z°±2 


72 


GNMCY88F 


I473 


-|1|Z 


73 


GNMCA21F 


82 




73 


GNMCA82F


3679 


3975 


73 


GNMCL92F


4664 


5205 


73 


GNMCL92R 


5485 


5880 


73 


GNMCM22R


708 


1428 


73 


GNi^CIVI2gR 


1947 


2683 


73 


GNMCO16R


1657 


2311 


73 


GNMCV93F


347 


830 
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GNMCV93R 


-- ^^z? 


^^^^ 


74 


GNMCA78F 




-2±?f 


74 


GNMCB76F 


—33— 


"ziz^ 


74 


GNMCB76R 


— — 


-Zrz5 


74 


GNMCF14R


1573





74 


GNMCF30F 


— — 

-— — 


^i: 


74 


GNMCF30R 







74 


GNMCL96F 




— — ^ 


_1^?Z? 


74 




Ly^^ 


J^?^ 


74 


OlMmOiNO 1 r 


— 





74 


oInIVH^InO I PC 


"TZT^ 


J601 


74 






14895 


74 


orNJiviurMDOK 




J.?H1® 


13517 


tI 







13557 


7A 


vj|\lvlL»U/ rr 





3525 


74 


Ol>ilVIOvJ/ 1 r\ 





^^^^ - 


74 




"T^l?4 


10254 


74 

■ — 


GNMCP02R 




-11??? 


— — 


GNMCQ12F 


— — 

__ 


-H2?B 






— 


J°?5 


74 


OlNmUVOn r 


"I^rZ^ 


-1£5°J 


74 

— 


r^^lMr'\/R^ o 


^^^^ 


-11??? 




OINlVlUAOUr 


"5^T?5 


19013 


74 


oFNivlUAOUrv 




-^2?1? 


74 




"21616 


^ 


74 


\j IN IVI V-* Ayn r\ 


20632 


^1246 


74 

_ 







-1?ZZ1 


_ 


o IN M 1 O r 


T^Pi 


-1???? 





GNMCZ16R




13933 




GNMCZ19F




■4^75^ 


-^?|1^ 


7^ 


oiNiviOMy*f r 




4349


7^ 


o IN mV-rDODr 


85 


2819 





GNMCB55R 





3917 


75


orNlvlL.L 1 or 


4716





75


GNMCL13R


2852


3443 


75 


GNMCL80F 


4341 


4845 


75 


GNMCL80R 


2903 


3473 


— — 


GNMCM78R 


-^If? 


2889 




GNMCV07F 






75 


GNMCV08R 


1221 


1918 


75 


GNMCV10F 


5011 


5503 


75 


GNMCV11R 


3483 


4212 


75 


GNMCV36F 


4495 


4971 


75 


GNMCV36R 


3285 


3527 


75 


GNMCV52F 


3868 


4351 


75 


GNMCV52R 


2491 


3098 
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75 


GNMCX78F 


3135 


3788


75 


GNMCX78R 


4397


5087


76 


GNMCB02F 


2416 


2977 


76 


GNMCB02R 


3352 


3966 


76 


GNMCB07F 


2416 




76 


GNMCB07R 


3352 


3954 


76 


GNMCB12F 




-|^^^ 


76 


GNMCB12R 


3314 




76 


GNMCY54R 


5129 


~566a 


77 


GNMCB54R


4435 


"HfiSn 





GNMCB85R 


"^^55 


-5 






GNMCF72F 




"^iz^ 




— 


GNMCF72R 


5936

-— 


J°zr : 


— 


GNMCK68R 




^55^ 


— 


GNMCM47F 










GNMCM47R 


4346


— 


77 


GNMCX10F


6886




77 


GNMCX10R 


■^801 


— — 


77 


GNMCZ08R


3508


^954 ~ 


78 


GNMCB60F 


1387


2047


78 


GNMCB60R 


2757


3429


79 


GNMCB65F 




"954 


79 


GNMCB65R 


1598


2122


79 


GNMCY11F 





"4016 


79 


GNMCY11R




"2911 


81 


GNMCD15F 


-j 


519 


82 


GNMCO75R


2040

__ 


"i^H 


83 


GNMCD53F 






84 


GNMCF02F


1638


— — 




GNMCF15F 


"^019 


^523 


85 


GNMCF15R 


"I257 


"I932 


^1 


oiNiviorzor 


-— — 


— — 




GNMCY26R 


-—z 


— — 

--^=5 


86 


GNMCF34F 


-— — 




86 


GNMCF34R 


"^59 


-— 




OINiVl^o^ 1 r 


— — 


"5tq 


87 


GNMCF36F 


"274 


■74F 




GNMCF71R 


~t0636 


"TTTeo 


88 


GNMCL78F 


2657 


3153 


88 


GNMCL78R 


4106 


4665 


88 


GNMCN10F 


7355 


8034 


88 


GNMCQ46F 


10928 


11579 


88 


GNMCQ46R 


9882 


10586 


88 


GNMCQ88F 


574 


1196 


88 


GNMCQ88R 


2017 


2549 


89 


GNMCF76F 


1981 


2406 
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GNMCF76R 





--^ 


89 


GNMCF80F 




1305 


89 


GNMCF80R 


-— — 


2709 


89 


GNMCL16F 


"247 

— — 


-I^ 


89 


GNMCL16R 


-— 


-1Z?1 


89 


GNMCN80F 


1^0 


1493 


^ 


GNMCQ82F 


— — 


2554 






-4§- 


1093 


89 


rsMr/ir'yiap 





2850 


19^ 


vjfMivion/:/ r 


-ii? 


501 


773 


oINIVlOrl / K 




1517 


1^9 




taNMUrl ir 


— 

— 


776 




(jlMMUrvl UK 


756 


1346 


1^3 

— 


oNmv-»OU 1 r 





1344 


— tH — 


vjiNlVIOAUOr 





1001 






-lil^ 


2144 


153 








1204 




GNMCK14R




352 


155 







445 


T^B 

rig 









9133 






"i^F* 


2694 


156 







1335 


156 


vjINmUO I Or 


Jz2? 


4033 


156 


orvJiVlL'OODr 


"T55^ 


5488 


156 


GNMCV01F





2231 


156 


oiNivio vuzrv 





894 


156 


OIN IVIUVOcJr 





3032 


1SR 




-5115 


4231 


157 


oNMLiLI 1 r 





834 


1^7 


(.jfNlVIUL 1 t K 





1846 


l^R 

lis 


oiNiviULour 





2276 




riMt^Ar^t IAD 


-il? 


1028 


T^fi 


orNiMk^ V4yK 


-i^IZ 


5164 


I^q 


orMIVH->L4or 


5961 


6264 


T^q 




Ol\IViOL40K 




5280 


159 


GNMCQ61R 


922 


1535 


159 


GNMCS71F


314 


1024 




GNMCY32F 


8722 


9407 


159 


GNMCY32R 


10063 




159 


GNMCY51F 


8917 


9628 


159 


GNMCY51R 


10406 


10895 


160 


GNMCL58R


4560 


5111 


160 


GNMCN05R


9955 


10528 


160 


GNMCO37F


8602 


9262 


160 


GNMCV04F 


951 


1370 


160 


GNMCV05R


1971 


2742 
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Coordinate 


161 


GNMCN26F 


4880 


5549 


161 


GNMCN26R 


3911 


4533 


— Hi — 


GNMCQ77F 


6238 


6857 






5035 


5760 


161 







4357 


Ifi 






2375 


2916 






1676 


2346 


1fi9 ■ 


oNIViOrN40K 


400 


977 


T^T 




oiNlVlL.Ny^r 


507 


1223 





olMIVIL.rMyiiK 


Jf^f 


2112







1142 


1860 





GNMCY42R 


2736 


3290 


163 




4711 


5225 





GNMCZ36R 


6070 


6592 


164


GNMCN94F 


3000 


3708 


164 


GNMCN94R 


1705 


2265 


165 


GNMCQ54F 


51 


677 


165 


GNMCQ54R 


936 


1639 










166    GNMCS74R    1        181
167    GNMCV58F    314      808
167    GNMCZ38F    6858     7329
167    GNMCZ38R    5443     5996
168    GNMCX26F    1        660
169    GNMCX92F    341      587
170    GNMCY65R    195      567
APPENDIX B



MenB ORFs 



Number 1 ORF 

1 . . TTCGGCGA CATCGGCGGT TTGAAGGTCA ATGCCCCCGT CAAATCCGCA 
51 GGCGTATTGG TCGGGCGCGT CGGCGCTATC GGACTTGACC CGAAATCCTA 
101 TCAGGCGAGG GTGCGCCTCG ATTTGGACGG CAAGTATCAG TTCAGCAGCG 
151 ACGTTTCCGC GCAAATCCTG ACTTCsGGAC TTTTGGGCGA GCAGTACATC 
201 GGGCTGCAGC AGGGCGGCGA CACGGAAAAC CTTGCTGCCG GCGACACCAT 
251 CTCCGTAACC AGTTCTGCAA TGGTTCTGGA AAACCTTATC GGCAAATTCA 
301 TGACGAGTTT TGCCGAGAAA AATGCCGACG GCGGCAATGC GGAAAAAGCC 
351 GCCGAATAA 



Number 2 ORF 

1 . .ATTTTGATAT ACCTCATCCG CAAGAATCTA GGTTCGCCCG TCTTCTTCTT 

51 TCAGGAACGC CCCGGAAAGG ACGGAAAACC TTTTAAAATG GTCAAATTCC 

101 GTTCCATGCG CGACGGCTTG TATTCAGACG GCATTCCGCT GCCCGACGGA 

151 GAACGCCTGA CACCGTTCGG CAAAAAACTG CGTGCCGcCA GTwTGGACGA 

201 ACTGCCTGAA TTATGGAATA TCTTAAAAGG CGAGATGAGC CTGGTCGGCC 

251 CCCGCCCGCT GCTGATGCAA TATCTGCCGC TGTACGACAA CTTCCAAAAC 

301 CGCCGCCACG AAATGAAACC CGGCATTACC GGCTGGGCGC AGGTCAACGG 

351 GCGCAACGCg CTTTCGTGGG ACGAAAAATT CGCCTGCGAT GTTTGGTATA 

401 TCGACCACTT CAGCCTGTGC CTCGACATCA AAATCCTACT GCTGACGGTT 

451 AAAAAAGTAT TAATCAAGGA AGGGATTTCC GCACAGGGCG AACA.aCCAT 

501 GCCCCCTTTC ACAGGAAAAC GCAJ\ACTCGC CGTCGTCGGT GCGGGCGGAC 

551 ACGGAAAAGT CGTTGCCGAC CTTGCCGCCG CACTCGGCCG GTACAGGGAA 

601 ATCGTTTTTC TGGACGACCG CGCACAAGGC AGCGTCAACG GCTTTTCCGT 

651 CATCGGCACG ACGCTGCTGC TTGAAAACAG TTTATCGCCC GAACAATACG 

701 ACGTCGCCGT CGCCGTCGGC AACAACCGCA TCCGCCGCCA AATCGCCGAA 

751 AT^GCCGCCG CGCTCGGCTT CGCCCTGCCC GTACTGGTTC ATCCGGACGC

801 GACCGTCTCG CCTTCTGCAA CAGTCGGACA AGGCAGCGTC GTTATGGCGA 

851 AAGCGGTCG.. 



Number 3 ORF 

1 . . AACCATATGG CGATTGTCAT CGACGAATAC GGCGGCACAT CCGGCTTGGT 

51 CACCTTTGAA GACATCATCG AGCAAATCGT CGGCGAAATC GAAGACGAGT 

101 TTGACGAAGA CGATAGCGCC GACAATATCC ATGCCGTTTC TTCAGACACG 

151 TGGCGCATCC ATGCAGCTAC CGAAATCGAA GACATCAACA CCTTCTTCGG 

201 CACGGAATAC AGCATCGAAG AAGCCGACAC CATT . GGCGG CCTGGTCATT 

251 CAAGAGTTGG GACATCTGCC CGTGCGCGGC GAAAAAGTCC TTATCGGCGG 

301 TTTGCAGTTC ACCGTCGCAC GCGCCGACAA CCGCCGCCTG CATACGCTGA 

351 TGGCGACCCG CGTGAAGTAA GC ACCGC CGTTTCTGCA 

401 CAGTTTAG 



Number 4 ORF 

1 ATGCGCGGCG GCAGGCCGGA TTCCGTTACC GTGCAGATTA TCGAAGGTTC 

51 GCGTTTTTCG CATATGAGGA AAGTCATCGA CGCAACGCCC GACATCGGAC 

101 ACGACACCAA AGGCTGGAGC AATGAAAAAC TGATGGCGGA AGTTGCGCCC 

151 GATGCCTTCA GCGGCAATGC TGAAgGGCAG TTTTTCCCCG ACAGCTACGA 

201 AATCGATGCG GGCGGCAGTG ATTTGCAGAT TTACCAAACC GCCTACAAgG 

251 GCGATGCAAC GCCGCCTGAA TGAgGGCATG GGAAAGCAGG CAGGACGGGC 

301 TGCCTTATAA AAACCCTTAT GAAATGCTGA TTATGGCGAr CCTGGTCGAA 

351 AAGGAAACAG GGCATGAAGC CGAsCsCGAC CATGTcGCTT CCGTCTTCGT 

401 CAACCGCCTG AAAATCGGTA TGCGCCTGCA AACCgAssCG TCCGTGATTT 
451 ACGGCATGGG TGCGGCATAC AAGGGCAAAA TCCGTAAAGC CGACCTGCGC
501 CGCGAaACGC CGTACAACAC CTACACGCGC GGCGGTCTGC CGCCAACCCC 
551 GATTGCC-CTG CCC. . 



Number 5 ORF 

1 CGTTTCAAAA TGTTAACTGT GTTGACGGCA ACCTTGATTG CCGGACAGGT 

51 ATCTGCCGCC GGAGGCGGTG CGGGGGATAT GAAACAGCCG AAGGAAGTCG 

101 GAAAGGTTTT CAGAAAGCAG CAGCGTTACA GCGAGGAAGA AATCAAAAAC 

151 GARCGCGCAC GGCTTGCGGC AGTGGGCGAG CGGGTTAATC AGATATTTAC 

201 GTTGCTGGGA GGGGAAACCG CCTTGCAAAA GGGGCAGGCG GGAACGGCTC 

251 TGGCAACCTA TATGCTGATG TTGGAACGCA CAAAATCCCC CGAAGTCGCC 

301 GAACGCGCCT TGGAAATGGC CGTGTCGCTG AACGCGTTTG AACAGGCGGA 

351 AATGATTTAT CAGAAATGGC GGCAGATTGA GCCTATACCG GGTAAGGCGC 

401 AAAAACG3GC GGGGTGGCTG CGGAACGTGC TGAGGGAAAG AGGAAATCAG

451 CATCTGGACG GACGGGAAGA AGTGCTGGCT CAGGCGGACG AAGGACAG



Number 6 ORF 

1 AACCTCTACG CCGGCCCGCA GACCACATCC GTCATCGCAA ACATCGCCGA 

51 CAACCTGCAA CTGGCCAAAG ACTACGGCAA AGTACACTGG TTCGCCTCCC 

101 CGCTCTTCTG GCTCCTGAAC CAACTGCACA ACATCATCGG CAACTGGGGC 

151 TGGGCGATTA TCGTTTTAAC CATCATCGTC AAAGCCGTAC TGTATCCATT 

201 GACCAACGCC TCTTACCGCT CTATGGCGAA AATGCGTGCC GCCGCACCCA 

251 AACTGCAAGC CATCAAAGAG AAATACGGCG ACGACCGTAT GGCGCAACAA 

301 CAGGCGATGA TGCAGCTTTA CACAGACGAG AAAATCAACC CGgCTGGGCG 

351 GCTGCC7GCC TATGCTGTTG CAAATCCCCG TCTTCATCGG ATTGTATTGG 

401 GCATTGTTCG CCTCCGTAGA ATTGCGCCAG GCACCTTGGC TGGGTTGGAT 

451 TACCGACCTC AGCCGCGCCG ACCCCTACTA CATCCTGCCC ATCATTATGG 

501 CGGCAACGAT GTTCGCCCAA ACTTATCTGA ACCCGCCGCC GAcCGACCCG 

551 ATGCagGCGA AAATGATGAA AATCATGCCG TTGGTTTTCT CsGwCrTGTT 

601 CTTCTTCTTC CCTGCCGGks TGGTATTGTA CTGGGTAGTC AACAACCTCC 

651 TGACCATCGC CCAGCAATGG CACATCAACC GCAGCATCGA AAAACAACGC 

701 GCCCAAGGCG AAGTCGTTTC CTAA



Number 7 ORF 

1 . . GCCGTCTTAA TCATCGAATT ATTGACGGGA ACGGTTTATC TTTTGGTTGT 

51 NAGCGCGGCT TTGGCGGGTT CGGGCATTGC TTACGGGCTG ACCGGCAGTA 

101 CGCCTGCCGC CGTCTTGACC GNCGCTCTGC TTTCCGCGCT GGGTATTTNG 

151 TTCGTACACG CCAAAACCGC CGTTAGAAAA GTTGAAACGG ATTCATATCA 

201 GGATTTGGAT GCCGGACAAT ATGTCGAAAT CCTCCGNCAC ACAGGCGGCA

251 ACCGTTACGA AGTT.TTTAT CGCGGTACG. ACTGGCAGGC TCAAAATACG 

301 GGGCAAGAAG AGCTTGAACC AGGAACTCGC GCCCTCATTG TCCGCAAGGA 

351 AGGCAACCTT CTTATTATCA CACACCCTTA A 



Number 8 ORF 

1 ATGTwTGATT TCGGTTTrGG CGArCTGGTT TTTGTCGGCA TTATCGCCCT 

51 GATwGtCCTC GGCCCCGAAC GCsTGCCCGA GGCCGCCCGC AyCGCCGGAC 

101 GGcTCATCGG CAGGCTGCAA CGCTTTGTCG GcAGCGTCAA ACAGGAATTT 

151 GACACTCAAA TCGAACTGGA AGAACTGAGG AAGGCAAAGC AGGAATTTGA 

201 AGCTGCCGcC GCTCAGGTTC GAGACAGCCT CAAAGAAACC GGTACGGATA

251 TGGAAGGCAA TCTGCACGAC ATTTCCGACG GTCTGAAGCC TTGGGAAAAA 

301 CTGCCCGAAC AGCGGACACC TGCCGATTTC GGTGTCGATG AAAACGGCAA 

351 TCCGCT.TCC CGATGCGGCA AACACCCTAT CAGACGGCAT TTCCGACGTT 

401 ATGCCGTC.



Number 9 ORF

1 ATGCAAGCAC GGCTGCTGAT ACCTATTCTT TTTTCAGTTT TTATTTTATC 

51 CGC.TGCGGG ACACTGACAG GTATTCCATC GCATGGCGgA GkTAAACgCT 

101 TTgCGGTCGA ACAAGAACTT GTGGCCGCTT CTGCCAGAGC TGCCGTTAAA 

151 GACATGGATT TACAGGCATT ACACGGACGA AAAGTTGCAT TGTACATTGC 

201 CACTATGGGC GACCAAGGTT CAGGcAGTTT GACAGGGGGG ICGCTACTCC 

251 ATTGATGCAC kGrTwCsTGG CGAATACATA AACAGCCCTG CCGTCCGTAC 

301 CGATTACACC TATCCACGTT ACGAAACCAC CGCTGAAACA ACATCAGGCG 

351 GTTTGACAGG TTTAACCACT TCTTTATCTA CACTTAATGC CCCTGCACTC 

401 TCTCGCACCC AATCAGACGG TAGCGGAAGT AAAAGCAGTC TGGGCTTAAA

451 TATTGGCGGG ATGGGGGATT ATCGAAATGA AACCTTGACG ACTAACCCGC 

501 GCGACACTGC CTTTCTTTCC CACTTGGTAC AGACCGTATT TTTCCTGCGC 

551 GGCATAGACG TTGTTTCTCC TGCCAATGCC GATACAGATG TGTTTATTAA 

601 CATCGACGTA TTCGGAACGA TACGCAACAG AACCGAAATG. . 



Number 10 ORF 

1 . . GG . CAGCACA AAAAACAGGC GGTTGAACGG AAAAACCGTA TTTACGATGA

51 TGCCGGGTAT GATATTCGGC GTATTCACGG GCGCATTCTC CGCAAAATAT

101 ATCCCCGCGT TCGGGCTTCA AATTTTCTTC ATCCTGTTTT TAACCGCCGT

151 CGCATTCAAA ACACTGCATA CCGACCCTCA GACGGCATCC CGCCCGCTGC

201 CCGGACTGCC CrGACTGACT GCGGTTTCCA CACTGTTCGG CACAATGTCG

251 AGCTGGGTCG GCATAGGCGG CGGTTCACTT TCCGTCCCCT TCTTAATCCA

301 CTGCGGCTTC CCCGCCCATA AAGCCATCGG CACATCATCC GGCCTTGCCT

351 GGCCGATTGC ACTCTCCGGC GCAATATCGT ATCTGCTCAA CGGCCTGAAT

401 ATTGCAGGAT TGCCCGAAGG GTCACTGGGC TTCCTTTACC TGCCCGCCGT

451 CGCCGTCCTC AGCGCGGCAA CCATTGCCTT TGCCCCGCTC GGTGTCAAAA

501 CCGCCCACAA ACTTTCTTCT GCCAAACTCA AAAAATC.TT CGGCATTATG

551 TTGCTTTTGA TTGCCGGAAA AATGCTGTAC AACCTGCTTT AA



Number 11 ORF 

1 . . GGAAACGGAT GGCAGGCAGA CCCCGAACAT CCGCTGCTCG GGCTTTTTGC

51 CGTCAGTAAT GTATCGATGA CGCTTGCTTT TGTCGGAATA TGTGCGTTGG

101 TGCATTATTG CTTTTCGGGA ACGGTTCAAG TGTTTGTGTT TGCGGCACTG

151 CTCAAACTTT ATGCGCTGAA GCCGGTTTA? TGGTTCGTGT TGCAGTTTGT

201 GCTGATGGCG GTTGCCTATG TCCACCGCrS CGGTATAGAC CGGCAGCCGC

251 CGTCAACGTT CGGCGGCTCG CAGCTGCGAC TCGGCGGGTT GACGGCAGCG

301 TTGATGCAGG TCTCGGTACT GGTGCTGCTG CTTTCAGAAA TTGGAAGATA

351 A



Number 12 ORF 

1 ATGAAAACCC CACTCCTCAA GCCTCTGCTN ATTACCTCGC TTCCCGTTTT 

51 CGCCAGTGTT TTTACCGCCG CCTCCATCGT CTGGCAGCTA GGCGAACCCA 

101 AGCTCGCCAT GCCCTTCGTA CTCGGCATCA TCGCCGGCGG CCTTGTCGAT 

151 TTGGACAACC NCNTGACCGG ACGGCTNAAA AACATCATCA CCACCGTCGC 

201 CCTGTTCACC CTCTCCTCGC TCACGGCACA AAGCACCCTC GGCACAGGGC 

251 TGCCCTTCAT CCTCGCCATG ACCCTGATGA CTT.CG.CTT CACCATTTTA 

301 GGCGCGGNCG . . . 



Number 13 ORF 



1 ATGAATATGC TGGGAGCTTT GGCAAAAGTC GGCAGCCTGA CGATGGTGTC

51 GCGCGTTTTG GGATTTGTGC GCGATACGGT CATTGCGCGG GCATTCGGCG

101 CGGGTATGGC GACGGATGCG TTTTTTGTCG CGTTCAAACT GCCCAACCTG

151 CTTCGCCGCG TGTTTGCGGA GGGGGCGTTT GCCCAAGCGT TTGTGCCGAT

201 TTTGGCGGAA TACAAGGAAA CGCGTTCAAA AGAGGCGG.C GAAGCCTTTA

251 TCCGCCATGT GGCGGGGATG CTGTCGTTTG TACTGGTTAT CGTTACCGCG

301 CTGGGCATAC TTGCCGCGCC TTGGGTGATT TATGTTTCCG CACCCGAGTT
351 TTGCCCAAGA TGCCGACAAA TTTCAGCTCT CCATCGATTT GCTGCGGATT

401 ACGTTTCCTT ATATATTATT GATT7CCCTG TCTTCATTTG TCGGCTCGGT 

451 ACTCAATTCT TATCATAAGT TCGGCATTCC GGCGTTTACG CCAC.GTTTC

501 TGAACGTGTC GTTTATCGTA TTCGCGCTGT TTTTCGTGCC GTATTTCGAT 

551 CCGCCCGTTA CCGCGCyGGC GTGGGCGGTC TTTGTCGGCG GCATTTTGCA 

601 ACTCGrmTTC CAACTGCCCT GGCTGGCGAA ACTGGGCTTT TTGAAACTGC

651 CCAAACtGAG TTTCAAAGAT GCGGCGGTCA ACCGCGTGAT GAAACAGATG 

701 GCGCCTGCgA TTTTgGGCGT GAgCGTGGCG CAGGTTTCTT TGGTGATCAA 

751 CACGATTTTc GCGTCTTATC TGCAATCGGG CAGCGTTTCA TGGATGTATT 

801 ACGCCGACCG CATGATGGAG CTGCCCAGCG GCGTGCTGGG GGCGGCACTC 

851 GGTACGATTT TGCTGCCGAC TTTGTCCAAA CACTCGGCAA ACCaAGATAC 

901 GGaACAGTTT TCCGCCCTGC TCGACTGGGG TTTGCGCCTG TGCATGCtgc 

951 TGACGCTGCC GGCGgcGGTC GGACTGGCGG TGTTGTCGTT cCCgCtGGTG 

1001 GCGACGCTGT TTATGTACCG CGwATTTACG CTGTTTGACG CGCAGATGAC 

1051 GCAACACGCG CTGATTGCCT ATTCTTTCGG TTTAATCGGC TTAATCATGA 

1101 TTAAAGTGTT GGCACCCGGC TTCTATGCGC GGCAAAACAT CAAwAmGCCC 

1151 GTCAAAATCG CCATCTTCAC GCTCATCTGC mCGCAGTTGA TGAACCTTGa 

1201 CTTTAyCGGC CCACTrrAAC rCagTCGGAC TTTCGCTTGC CATCGGTCTG 

1251 GGCGCGTGTA TCAATGCCGG ATTGTTGTTT TACCTGTTGC GCAGACACGG 

1301 TATTTACCAA CCTGG.CAAG GGTTGGGCAG CGTTCTT.AG CAAAAATGCT 

1351 GcTCTCGCTC GCCGTGA



Number 14 ORF 

1 atGATTAAAA TCAAAAAAGG TCTAAACCTG CCCATCGCGG GCAGACCGGA 

51 GCAAGCCGTT tACGACGGCC CGGCCaTTAC CGAAGtCGCG TTGCTTGGCG 

101 AAGAATATGC CGGTATGCGC CCCTCGATGA AAGTCAAGGA AGGCGATGCC 

151 GTcAAAAAAG GCCAAGTGCT GTTTGAAGAC AAAAAGAATC CGGGCGTGGT 

201 GTTTACTGCG CCGGCTTCAG GcAAAATCGC CGCGATTCAC CGTGGCGAAA 

251 AGCGCGTACT TCAGTCAGTC GTGATTGCCG TTGAArGCAA CGACGAAATC 

301 GAGTTTGAAC GCTACGCACC TGAAGCGCTG GCAAACTTAA GCGGCGAAGA 

351 AGTGCGCCGC AACCTGATCC AATCCGGTTT GTGGACTGCG CTGCGCACCC 

401 GTCCGTTCAG CAAAATTCCT GCCGTCGATG CCGAGCCGTT CGCCATCTTC 

451 GTCAATGCGA tGGACACCAA TCCG. . 



Number 15 ORF 

1 . . GCGnCGnAAA TCATCCATCC CC . . nACGTC GTAGGCCCTG AAGCCAACTG 

51 GTTTTTTATG GTAGCCAGTA CGTTTGTGAT TGCTTTGATT GGTTATTTTG 

101 TTACTGAAAA AATCGTCGAA CCGCAATTGG GCCCTTATCA ATCAGATTTG 

151 TCACAAGAAG AAAAAGACAT TCGGCATTCC AATGAAATCA CGCCTTTGGA 

201 ATATAAAGGA TTAATTTGGG CTGGCGTGGT GTTTGTTGCC TTATCCGCCC 

251 TATTGGCTTG GAGCATCGTC CCTGCCGACG GTATTTTGCG TCATCCTGAA 

301 ACAGGATTGG TTTCCGGTTC GCCGTTTTTA AAATCGATTG TTGTTTTTAT 

351 TTTCTTGTTG TTTGCACTGC CGGGCATTGT TTATGGCCGG GTAACCCGAA 

401 GTTTGCGCGG CGAACAGGAA GTCGTTAATG CGmyGGCCGA ATCGATGAGT 

451 ACTCTGGsGC TTTmTTTGsw CAk.cATCTTT TTTGCCGCAC AGTTTGTCGC 

501 ATTTTTTAAT TGGACGAATA TTGGGCAATA TATTGCCGTT AAAGGGGCGA 

551 CGTTCTTAAA AGAAGTCGGC TTGGGCGGCA GCGTGTTGTT TATCGGTTTT 

601 ATTTTAATTT GTGCTTTTAT CAATCTGATG ATAGGCTCCG CCTCCGCGCA 

651 ATGGGCGGTA ACTGCGCCGA TTTTCGTCCC TATGCTGATG TTGGCCGGCT 

701 ACGCGCCCGA AGTCATTCAA GCCGCTTACC GCATCGGTGA TTCCGTTACC 

751 AATATTATTA CGCCGATGAT GAGTTATTTC GGGCTGATTA TGGCGACGGT 

801 GrkCmmmTAC AAAAAAGATG CGGGCGTGGG TaCGcTGATT wCTATGATGT 

851 TGCCGTATTC CGCTTTCTTC TTGATTGCgT GGATTGCCTT ATTCTGCATT 

901 TGGGTATTTg TTTTGGGCGT GCCCGTCGGT CCCGGCGCGC CCACATTCTA 

951 TCCCGCACCT TAA 



Number 16 ORF 

1 . . ACAGCCGGCG CAGCAGGTTn CnCGGTCTTC GTTTTCGTAA CGGACAGTCA 
51 GGTGGAGGTG TTCGGGAACA TCCAGACCGC AGTGGAAACA GGTTTTTTTC 
101 ATGGCATTTC GGTTTCGTCT GTGTTTGGTG CGGCGGCACA AGACTCGGCA

151 ATgGCTTCGC GCAGTGCGTC TATACCGGTA TTTTCAGCAA CGGAAATGCG 

201 GACGGcGgCA ATTTTTCCCG CAGCGTCGCG CCATATGCCC GTGTTTTgTT 

251 CTTCAGACGG CAGCAGGTCG GTTTTGTTGT ACACCTTgAT GCACGGAaTA 

301 TCGCCGGCAT GGATTTCTTG CAGTACGTTT TCCACGTCTT CAATCTGCTG 

351 TCCGCTGTTC GGAGCGGCGG CATCGACGAC GTGCAGCAGC ACATCgGcTT 

401 gCGCGGTTTC TTCCAGCGTG GCgGAAAAGG CGGAAATCAG TTTgTGCGGC 

451 agATyGCTnA CGAATCCGAC GGTATCGGTC AGGATAATGC TGCATTCGGG

501 ACT.. 



Number 17 ORF 

1 ..GGCCATTACT CCGACCGCAC TTGGAAGCCG CGTTTGGNCG GCCGCCGTCT 

51 GCCGTATCTG CTTTATGGCA CGCTGATTGC GGTTATTGTG ATGATTTTGA 

101 TGCCGAACTC GGGCAGCTTC GGTTTCGGCT ATGCGTCGCT GGCGGCTTTG 

151 TCGTTCGGCG CGCTGATGAT TGCGCTGTTA GACGTGTCGT CAAATATGGC 

201 GATGCAGCCG TTTAAGATGA TGGTCGGCGA CATGGTCAAC GAGGAGCAGA 

251 AAA.NTACGC CTACGGGATT CAAAGTTTCT TAGCAAATAC GGGCGCGGTC 

301 GTGGCGGCGA TTCTGCCGTT TGTGTTTGCG TATATCGGTT TGGCGAACAC 

351 CGCCGANAAA GGCGTTGTGC CGCAGACCGT GGTCGTGGCG TTTTATGTGG 

401 GTGCGGCGTT GCTGGTGATT ACCAGCGCGT TCACGATTTT CAAAGTGAAG 

451 GAATACGANC CGGAAACCTA CGCCCGTTAC CACGGCATCG ATGTCGCCGC 

501 GAATCAGGAA AAAGCCAACT GGATCGCACT CTTAAAA.CC GCGC. 



Number 18 ORF 

1 ATGTTGTTCC GTAAAACGAC CGCCGCCGTT TTGGCGCATA CCTTGATGCT

51 GAACGGCTGT ACGTTGATGT TGTGGGGAAT GAACAACCCG GTCAGCGAAA

101 CAATCACCCG NAAACACGTT GNCAAAGACC AAATCCGNGN CTTCGGTGTG

151 GTTGCCGAAG ACAATGCCCA ATTGGAAAAG GGCAGCCTGG TGATGATGGG

201 CGGAAAATAC TGGTTCGTCG TCAATCCCGA AGATTCGGCG AA.NTGACGG

251 GNATTTTGAN GGCAGGGCTG GACAAACCCT TCCAAATAGT TNAGGATACC

301 CCGAGCTATG C . TGCCACCA AGCCCTGCCG GTCAAACTCG GATCGNCTGG

351 CAGCCAGAAT . . .



Number 19 ORF 

1 . . GTCAGTCCTG TACTGCCTAT TACACACGAA CGGACAGGGT TTGAAGGTGT

51 TATCGGTTAT GAAACCCATT TTTCAGGGCA CGGACATGAA GTACACAGTC

101 CGTTCGATCA TCATGATTCA AAAAGCACTT CTGATTTCAG CGGCGGTGTA

151 GACGGCGGTT TTACTGTTTA CCAACTTCAT CGAACATGGT CGGAAATCCA

201 TCCGGAGGAT GAATATGACG GGCCGCAAGC AGCG.ATTAT CCGCCCCCCG

251 GAGGAGCAAG GGATATATAC AGCTATTATG TCAAAGGAAC TTCAACAAAA

301 ACAAAGACTA GTATTGTCCC TCAAGCCCCA TTTTCAGACC GTTGGCTAGA

351 AGAAAATGCC GGTGCCGCCT CTGGT. .



Number 20 ORF 

1 ATGAAAAAAC AAATCACCGC AGCCGTAATG ATGCTGTCTA TGATTGCCCC
51 CGCAATGGCA AACGGCTTGG ACAATCAGGC ATTTGAAGAC CAAATGTTCC
101 ACACGCGGGC AGATGCACCG ATGCAG. . .



Number 21 ORF 

1 ATGAATAAAA CTCTCTATCG T3TAATTTTC AACCGCAAAC GTGGGGCTGT

51 GrTAGCCGTT GCTGAAACTA CCAAGCGCGA AGGTAAAAGC TGTGCCGATA

101 GTGATTCAGG CAGCGCTCAT GTGAAATCTG TTCCTTTTGG TACTACTCAT

151 GCACCTGTTT GTg.CGTTaC AAATATCTTT TCTTTTTCTT TATTGGGCTT

201 TTCTTTATGT TTGGCTGTAG GtacGGyCAA TATTGCTTTT GCTGATGGCA
251 TT..



Number 22 ORF 

1 ATGAATACTC CTCCTTTTGT CTGTTGGATT TTTTGCAAGG TCATCGACAA 

51 TTTCGGCGAC ATCGGCGTTT CGTGGCGGCT CGCCCGTGTT TTGCACCGCG 

101 AACTCGGTTG GCAGGTGCAT TTGTGGACGG ACGATGTGTC CGCCTTGCGT 

151 GCGCTTTGCC CTGATTTGCC CGATGTTCCC TGCGTTCATC AGGATATTCA 

201 TGTCCGCACT TGGCATTCCG ATGCGGCAGA TATTGATACC GCG. . 



Number 23 ORF 

1 . . TTGTTCCTGC GTGTNAAAGT GGGGCGTTTT TTCAGCAGTC CGGCGACGTG

51 GTTTCGGGNC ARAGACCCTG TAAATCAGGC GGTGTTGCGG CTGTATNCGG

101 ACGAGTGGCG GCA.ACTTCG GTACGTTGGA AAATAGNCGC AACGTCGCAC

151 AGCCTGTGGC TCTGCACGCT GCTCGGAATG CTGGTGTCGG TATTGTTGCT

201 GCTTTTGGTG CGGCAATATA CGTTCAACTG GGAAAGCACG CTGTTGAGCA

251 ATGCCGCTTC GGTACGCGCG GTGGAAATGT TGGCATGGCT GCCGTCGAAA

301 CTCGGTTTCC CTGTCCCCGA TGCGCGGTCG GTCATCGAAG GCCGTCTGAA

351 CGGCAATATT GCCGATGCGC GGGCTTGGTC GGGGCTGCTG GTCGNCAGTA

401 TCGCCTGCTR NGGCATCCTG CCGCGCCTG . .



Number 24 ORF 

1 . . CAGAAGAGTT TGTCGAGAAT TTCTTTATGG GGTTTGGGCG GCGTGTTTTT

51 CGGGGTGTCC GGTCTGGTAT GGTTTTCTTT GGGCGTTTCT TT . GAGTGCG

101 CCTGTTTTTC GGGTGTTTCT TTTCGGGGTT CGGGACGGGG GACGTTTGTG

151 GGCAGTACGG GGGTTTCTTT GAGTGTGTTT TCAGCTTGTG TTCC.GGCGT

201 CGTCCGGCTG CCTGTCGGTT TGAGCTGTGT CGGCAGGTTG CG . . GTTTGA

251 CCCGGTTTTT CTTGGGTGCG GCAGGGGACG TCATTCTCCT GCCGCTTTCG

301 TCTGTGCCGT CCGGCTGTGC GGGTTCGGAT GAGGCGGCGT GGTGGTGTTC

351 GGGTTGGGCG GCATCTTGTT CCGACTACGC CGTTTGGCAG CCAGAATTCG

401 GTTTCGCGGG GGCTGTCGGT GTGTTGCGGT TCGGCTTGAA GGGTTTTGTC

451 GTCC.



Number 25 ORF 

1 ATGAAAACCT TCTTCAAAAC CCTTTCCGCC GCCGCACTCG CGCTCATCCT

51 CGCCGCCTGC GGATT.CAAA AAGACAGCGC GCCCGCCGCA TCCGCTTCTG

101 CCGCCGCCGA CAACGGCGCG GCGTAAAAAA GAAATCGTCT TCGGCACGAC

151 CGTCGGCGAC TTCGGCGATA TGGTCAAAGA ACAAATCCAA GCCGAGCTGG

201 AGAAAAAAGG CTACACCGTC AAACTGGTCG AGTTTACCGA CTATGTACGC

251 CCGAATCTGG CATTGGCTGA GGGCGAGTTG



Number 26 ORF 

1 CCTCGTCGTC CTCGGCATGC TCCAGTTTCA AGGGGCGATT TACTCCAAGG 

51 CGGTGGAACG TATGCTCGGC ACGGTCATCG GGCTGGGCGC GGGTTTGGGC 

101 GTTTTATGGC TGAACCAGCA TTATTTCCAC GGCAACCTCC TCTTCTACCT 

151 CACCGTCGGC ACGGCAAGCG CACTGGCCGG CTGGGCGGCG GTCGGCAAAA 

201 ACGGCTACGT CCCTmTGCTG GCAGGGCTGA CGATGTGTAT GCTCATCGGC 

251 GACAACGGCA GCGAATGGCT CGACAGCGGA CTCATGCGCG CCATGAACGT 

301 GCTCATCGGC GyGGCCATCG CCATCGCCGC CGCCAAACTG CTGCCGCTGA 

351 AATCCACACT GATGTGGCGT TTCATGCTTG CCGACAACCT GGCCGACTGC 

401 AGCAAAATGA TTGCCGAAAT CAGCAACGGC AGGCGCATGA CCCGCGAACG

451 CCTCGAGGAG AACATGGCGA AAATGCGCCA AATCAACGCA CGCATGGTCA

501 AAAGCCGCAG CCATCTCGCC GCCACATCGG GCGAAAGCTG CATCAGCCCC 

551 GCCATGATGG AAGCCATGCA GCACGCCCAC CGTAAAATCG TCAACACCAC 

601 CGAGCTGCTC CTGACCACCG CCGCCAAGCT GCAATCTCCC AAACTCAACG



651 GCAGCGAAAT CCGGCTGCTT GACCGCCACT TCACACTGCT CCAAAC 

701 GC AGACACGCCC GCCGCATCCG 

751 CATCGACACC GCCATCAACC CCGAACTGGA AGCCCTCGCC GAACACCTCC 

801 ACTACCAATG GCAGGGCTTC CTCTGGCTCA GCACCGATAT GCGTCAGGAA 

851 ATTTCCGCCC TCGTCATCCT GCTGCAACGC ACCCGCCGCA AATGGCTGGA 

901 TGCCCACGAA CGCCAACACC TGCGCCAAAG CCTGCTTGA 



Number 27 ORF 

1 . . GAAATCAGCC TGCGGTCCGA CNACAGGCCG GTTTCCGTGN CGAAGCGGCG 

51 GGATTCGGAA CGTTTTCTGC TGTTGGACGG CGGCAACAGC CGGCTCAAGT 

101 GGGCGTGGGT GGAAAACGGC ACGTTCGCAA CCGTCGGTAG CGCGCCGTAC 

151 CGCGATTTGT CGCCTTTGGG CGCGGAGTGG GCGGAAAAGG CGGATGGAAA 

201 TGTCCGCATC GTCGGTTGCG CTGTGTGCGG AGAATTCAAA AAGGCACAAG 

251 TGCAGGAACA GCTCGCCCGA AAAATCGAGT GGCTGCCGTC TTCCGCACAG 

301 GCTTT.GGCA TACGCAACCA CTACCGCCAC CCCGAAGAAC ACGGTTCCGA 

351 CCGCTGGTTC AACGCCTTGG GCAGCCGCCG CTTCAGCCGC AACGCCTGCG 

401 TCGTCGTCAG TTGCGGCACG GCGGTAACGG TTGACGCGCT CACCGATGAC

451 GGACATTATC TCGGAGA.GG AACCATCATG CCCGGTTTCC ACCTGATGAA 

501 AGAATCGCTC GCCGTCCGAA CCGCCAACCT CAACCGGCAC GCCGGTAAGC 

551 GTTATCCTTT CCCGACCGG. . 



Number 28 ORF 

1 ATGTTTTACC AAATCCTTGC CCTGATTATC TGGAGCAGCT CGTTTATTGC 

51 CGCC7WVTAT GTCTATGGCG GCATCGATCC CGCATTGATG GTCGGCGTGC 

101 GCCTGCTAAT TGCCGCGCTG CCTGCACTGC CCGCCTGCCG CCGTCATGTC 

151 GGCAAGATTC CGCGTGAGGA ATGGAAGCCG TTGCTGATTG TGTCGTTCGT 

201 CAACTATGTG CTGACCCTGC TGCTTCAGTT TGTCGGGTTG AAATACACTT 

251 CCGCCGCCAG CGCATCGGTC ATTGTCGGAC TCGAGCCGCT GCTGATGGTG 

301 TTTGTCGGAC ACTTTTTCTT CAACGACAAA GCGCGTGCCT ACCACTGGAT 

351 ATGCGGCGCG GCGGCATTTG CCGGTGTCGC GCTGCTGATG GCGGGCGGTG 

401 CGGaAGAGGG CGGCGaAGTC GGCTGGTTCG GCTGCCTGCT GGTGTTGTTG

451 GCGGGCGCGG GCTTTTGTGC CGCTATGCGT CCGACGCAAA GGCTGATTGC

501 ACGCATCGGC GCACCGGCAT TCACATCTGT TTCCATTGCC GCCGCATCGT 

551 TGATGTGCCT GCCGTTTTCG CTTGCTTTGG CGCAAAGTTA TACCGTGGAC 

601 TGGAGCGTCG GGATGGTATT GTCGCTGCTG TATTTGGGTT TGGGGTGC. 



Number 29 ORF 

1 ATGCGCCGTT TTCTACCGAT CGCAGCCATA TGCGCmGwms TCCTGkkGTA

51 sGGACTGACG GCGGCAACCG GCAGCACCAG TTCGCTGGCG GATTATTTCT

101 GGTGGATTGT TGCGTTCAGC GCAATGCTGC TGCTGGTGTT GTCCGCCGTT

151 TTGGCACGTT ATGTCATATT GCTGTTGAAA GACAGGCGCG ACGGCGTATT

201 CGGTTCG£tA srTyGCCAAA gsGCCTgkks TGGG.ATGTT TACGCTGGTT

251 GCCGkACTGC CCGGCGTGTT TCTGTTCGGC TTTCCCGCAC AGTTCATCAA

301 CGGCACGATT AATTCGTGGT TCGGCAACGA TACCCACGAG GCGCTTGAAC

351 GCAGCCTCAA TTTGAGCAAG TCCGCATTGA ATTTGGCGGC AGACAACGCC

401 CTCGGCAACG CCGTCCCCGT GCAGATAGAC CTCATCGGCG CGGCTTCCCT

451 GCCCGGGGAT ATGGGCAGGG TGCTGGAACA TTACGCCGGC AGCGGTTTTG

501 CCCAGCTTGC CCTGTACAAy ksCGCAAGCG GCAAAATCGA AAAAAGCATC

551 AACCCGCACA AGCTCGATCA GCCGTTTCCA GGTAAGGCGC GTTGGGAaAa

601 AATCCaACGG GCGGGTTCGG TCAGGGATTT GGAAAGCATA GGCGGCGTAT

651 TGTaCGCGCA GGGCTGGCTG TCGGCGGGTA CGCACwACGG GCGCGATTAC

701 GCCTTGTTTT TCCGTCAGCC GGTTCCCAAA GGCGTGGCAG AGGATGCCGT

751 yTTAATCGAA AAGGCAAGGG CGAAATATGC TGAGTTGAGT TACAGCAAAA

801 AAGGTTTGCA GACCTTTTTC CTGGCAACCC TGCTGATTGC GTCGCTGCTG

851 TCGATTTTTC TTGCACTGGT CATGGCACTG TATTTCGCCC GCCGTTTCGT

901 CGAACCCGTC CTATCGCTTG CCGAGGGGGC GAAGGCGGTG GCGCAAGGCG

951 ATTTCAGCCA GACGCGCCCC GTGTTGCGCA ACGACGAGTT CGGACGCTTG

1001 ACCArGTTGT TCAACCACAT GACCGAGCAG CTTTCCATCG CCAAAGATGC

1051 AGACGAGCGC AACCGCCGGC GCGAGGAAGC CGCCAGGCAT TATCTTGAAT



1101 GCGTGTTGGA GGGGCTGACC ACGGGCGTGG TGGTGTTTGA CGAACAAGGC
1151 TGTCTGAAAA CCTTCAACAA AGCGGCGGGT ACC. . 



Number 30 ORF 

1 ATGTACGCAT TTACCGCCGC ACAGCAACAG AAGGCACTCT TCCGGCTGGT 

51 GCTTTTTCAT ATCCTCATCA TCGCCGCCAG CAACTATCTG GTGCAGTTCC 

101 CTTTCCAAAT TTTCGGCATC CACACCACTT GGGGCGCATT TTCCTTTCCC 

151 TTCATCTTCC TTGCCACCGA CCTGACCGTC CGCATTTTCG GTTCTCACTT 

201 GGCACGGCGG ATTATCTTTT GGGTGATGTT CCCCGCCCTT TTGCTTTCCT 

251 ACGTCTTTTC CGTTTTGTTC CACAACGGCA GTTGGACAGG CTTGGGCGCG 

301 CTGTCCGAAT TCAACACCTT TGTCGGACGC ATCGCCTTAG CCAGCTTTGC 

351 CGCCTACGCG ATCGGACAAA TCCTTGATAT TTTTGTATTC AACAAATTAC 

401 GCCGTCTGAA AGCGTGGTGG ATTGCACCGA ACGCATCAAC CGTCATCGGG 

451 CACGCGTTGG ATACG. . .



Number 31 ORF 

1 ATGGTCATAA AATATACAAA TTTGAATTTT GCGAAATTGT CGATAATTGC 

51 AATTTTGATG ATGTATTCGT TTGAAGCGAA TGCAAAyGCA GTmwrAATAT 

101 CTGAAACTGT TTCAGTTGAT ACCGGACAAG GTGCGAAAAT TCATAAGTTT 

151 GTACCTAAAA ATAGTAAAAC TTATTCATCT GATTTAATAA AAACGGTAGA 

201 TTTAACACAC AyyCCTACGG GCGCAAAAGC CCGAATCAAC GCCAAAATAA 

251 CCGCCAGCGT ATCCCGCGCC GGCGTATTGG CGGGGGTCGG CAAACTTGCC 

301 CGCTTAGgCG CGAAATTCAG CACAAGGGCG GTtCCCTATG TCGGAACAGC 

351 CcTTTTAGCC CACGACGTAT ACGAAAcTTT CAAAGAAGAC ATACAGGCAC 

401 GAGGCTACCA ATACGACCCC GAAACCGACA AATTTGTAAA AGGCTACGAA 

451 TATAGTAATT GCCTTTGGTA CGAAGACAAA AGACGTATTA ATAGAACCTA 

501 TGGCTGCTAC GGCGTTGAT. . 



Number 32 ORF 

1 ATGAGATTTT TCGGTATCGG TTTTTTGGTG CTGCTGTTTT TGGAGATTAT 

51 GTCGATTGTG TGGGTTGCCG ATTGGCTGGG CGGCGGCTGG ACGTTGTTTT 

101 TGATGGCGGC AGGTTTTGCC GCCGGCGTGC TGATGCTCAG GCAAACCGGG 

151 SCTGACCGGT CTTTTATTGG CGGGCGCGGC AJiTGAGAAGC GGCGGGAAGG 

201 TATCCGTTTA TCAGATGTTG TGGCCTATC. . 



Number 33 ORF 

1 ATGTTTGTTT TTCAGACGGC ATTCTT.ATG TTTCAGAAAC ATTTGCAGAA 

51 AGCCTCCGAC AGCGTCGTCG GAGGGACATT ATACGTGGTT GCCACGCCCA 

101 TCGGCAATTT GGCGGACATT ACCCTGCGCG CTTTGGCGGT ATTGCAAAAG 

151 GCG GCCGA AGACACGCGC GTTACCGCAC AGCTTTTGAG 

201 CGCGTACGGC ATTCAGGGCA AACTCGTCAG TGTGCGCGAA CACAACGAAC 

251 GGCAGATGGC GGACAAGATT GTCGGCTATC TTTCAGACGG CATGGTTGTG 

301 GCACAGGTTT CCGATGCGGG TACGCCGGCC GTGTGCGACC CGGGCGCGAA 

351 ACTCGCCCGC CGCGTGCGTG AGGCCGGGTT TAAAGTCGTT CCCGTCGTGG 

401 GCGCAAC.GC GGTGATGGCG GCTTTGAGCG TGGCCGGTGT GGAAGGATCC 

451 GATTTTTATT TCAACGGTTT TGTACCGCCG AAATCGGGAG AACGCAGGAA 

501 ACTGTTTGCC AAATGGGTGC GGGCGGCGTT TCCTATCGTC ATGTTTGAAA 

551 CGCCGCACCG CATCGGTGCA GCGCTTGCCG ATATGGCGGA ACTGTTCCCC 

601 GAACGCCGAT TAATGCTGGC GCGCGAAATT ACGAAAACGT TTGAAACGTT 

651 CTTAAGCGGC ACGGTTGGGG AAATTCAGAC GGCATTGTCT GCCGACGGCG 

701 ACCAATCGCG CGGCGAGATG GTGTTGGTGC TTTATCCGGC GCAGGATGAA

751 AAACACGAAG GC7TGTCCGA GTCCGCGCAA AACATCATGA AAATCCTCAC

801 AGCCGAGCTG CCGACCA.aAC AGGCGGCGGA GCTTGCTGCC AAAATCACGG 

851 GCGAGGGAAA GAAAGCTTTG TACGAT.. 



Number 34 ORF

1 ATGAAACAGA AAAAAACCGC 
51 TTTTGCGGCA GC.AAAGCAC 

651 GAGTTGG 

701 AGGAAAAAGC CCGCTTGAAA 
751 AAACCGTAA 



TGCCGCAGTT ATTGCTGCAA TGTTGGCAGG 

CCGAAATCGA CCCGGCTTTG 

// 

TCAGAAACCA GTTGGAGCAG GGTTTGAGAC 
ATCGATGCCC TTTTGGAAGA AAACGGTGTC 



Number 35 ORF 

1 ATGAAAAAAT CTTTCCTTAC GCTTGTTCTG TATTCGTCTT TACTTACCGC 

51 CAGCGAAATT GCCTTACCCC TTGGAATTGG GGATTGAAAC CTTACCGGCG 

101 GCAAAAATTG CGGAAACGTT TGCGCTGACA TTTGTGATTG CTGCGCTGTA 

151 TCTGTTTGCG CGTAATAAGG TGACGCGTTT GTTGATTGCG GTGTTTTTTG 

201 CGTTCAGCAT TATTGCCAAC AATGTGCATT ACGCGGATTA TCAAAGCTGG 

251 ATGACG 

// 

1201 CAAACCGTAT TCGAGCAGCT GCAAAAGACT CCTGACGGCA 

1251 ACTGGCTGTT TGCCTATACC TCCGATCATG GCCAGTATGT TCGCCAAGAT 

1301 ATCTACAATC AAGGCACGGT GCAGCCCGAC AGCTATCTCG TGCCGCTAGT 

1351 GTTGTACAGC CCGGATAAGG CCGTGCAACA GGCTGCCAAC CAGGCTTTTG 

1401 CGCCTTGCGA GATTGCCTTC CATCAGCAGC TTTCAACGTT CCTGATTCAC

1451 ACGTTGGGCT ACGATATGCC GGTTTCAGGT TGTCGCGAAG GCTCGGTAAC 

1501 GGGCAACCTG ATTACGGGTG ATGCAGGCAG CTTGAACATT CGCGACGGCA 

1551 AGGCGGAATA TGTTTATCCG CAATGA 



Number 36 ORF 



1 . . .ACCCTGCTCC TCTTCATCCC CCTCGTCCTC ACAC.GTGCG GCACACTGAC

51 CGGCATACTC GCCCaCGGCG GCGGCAAACG CTTTGCCGTC GAACAAGAAC

101 TCGTCGCCGC ATCGTCCCGC GCCGCCGTCA AAGAAATGGA TTTGTCCGCC

151 yTAAAAGGAC GCAAAGCCGC CyTTTACGTC TCCGTTATGG GCGACCAAGG

201 TTCGGGCAAC ATAAGCGGCG GACGCTACTC TATCGACGCA CTGATACGCG

251 GCGGCTACCA CAACAACCCC GAAAGTGCCA CCCAATACAG CTACCCCGCC

301 TACGACACTA CCGCCACCAC CAAATCCGAC GCGCTCTCCA GCGTAACCAC

351 TTCCACATCG CTTTTGAACG CCCCCGCCGC CGyCyTGACG AAAAACAGCG

401 GACGCAAAGG CGAACGcTCC GCCGGACTGT CCGTCAACGG CACGGGCGAC

451 TACCGCAACG AAACCCTGCT CGCCAACCCC CGCGACGTTT CCTTCCTGAC

501 CAACCTCATC CAAACCGTCT TCTACCTGCG CGGCATCGAA GTCgTACCGC

551 CCGrATACGC CGACACCGAC GTATTCGTAA CCGTCGACGT A. . .



Number 37 ORF 



1 ATGGCAGAGA TCTGTTTGAT AACCGGCACG CCCGGTTCAG GGAAAACATT
51 AAAAATGGTT TCCATGATGG CGAATGATGA AATGTTTAAG CCTGATGAAA
101 AAGCCATACG CCGTAAAGTA TTTACGAACA TAAAAGGCTT GAAAATACCG
151 CACACCTACA TAGAAACGGA CGCAAAAAAG CTGCCGAAAT CGACAGATGA
201 GCAGCTTTCG GCGCATGATA TGTACGAATG GATAAAGAAG CCCGAAAATA
251 TCGGGTCTAT TGTCATTGTA GATGAAGCTC AAGACGTATG GCCGGCACGC
301 TCGGCAGGTT CAAAAATCCC TGAAAATGTC CAATGGCTGA ATACGCACAG
351 ACATCAGGGC ATTGATATAT TTGTTTTGAC TCAAGGTCCT AAGCTTCTAG
401 ATCAAAATCT TAGAACGCTT GTACGGAAAC ATTACCACAT CGCTTCAAAC
451 AAGATGGGTA TGCGTACGCT TTTAGAATGG .'UiAATATGCG CGGACGATCC
501 CGTAAAAATG GCATCAAGCG CATTCTCCAG TATCTATACA CTGGATAAAA
551 AAGTTTATGA CTTGTAysrr TmmGCGGAAG TTCATACCGT AAATAAGGTC
601 AAGCGGTCAA AGTGGTTTTA CACTCTGCCa GTAATAGTAT TGCTGATTCC
651 CGTGTTTGTC GGCCTGTCCT ATAAAATGTT GagCaGTTAC GGAAAAAAAC
701 aGGAAGAACC CGCAGCACAA GAATCGGCGG CAACAGAACA GCAGGCAGTA
751 CTTCCGGATA AAACAGAAGG CGAGCCGGTA AATAACGGCA ACCTTACCGC
801 AGATATGTTT GTTCCGACAT TGTCCGAaAA ACCCGrAAGC AAGCcgaTTT
851 ATAACGGTGT AAGGCAGGTA AGAACCTTTG AATATATAGC AGGCTGTATA
901 GAAGGCGGAA GAACCGGATG CGCCTGCTAT TCGCaTCAAG GGACGGCATt

951 gaAAGAAGTG ACGGaGTTGA TGTGc£aAgG aCTATGTaAA AAacGGCTTG 

1001 CCGTTTAACC CaTACAAAGA AGAAAGCCAA GGGCAGGAAG TTCAGCAAAG 

1051 CGCGCAgCAA CATTCGGACA GGGCGgCAAG TTGCCACATT GGGCGGAAAA 

1101 CCGTAGCAGA ACCTAATGTA CGATAATTGG GAAGAACGCG GGAAACCGTT 

1151 TGAAGGAATC GGaCGGGGGC GTGGTCGGAT CGGCAAACTG A 



Number 38 ORF 

1 GTGGTTTTCC TGAATGCCGA CAACGGGATA TTGGTTCAGG ACTTGCCTTT 

51 TGAAGTCAAA CTGAAAAAAT TCCATATCGA TTTTTACAAT ACGGGTATGC 

101 CGCGTGATTT CGCCAGCGAT ATTGAAGTGA CGGACAAGGC AACCGGTGAG 

151 AAACTCGAGC GCACCATCCG CGTGAACCAT CCTTTGACCT TGCACGGCAT 

201 CACGATTTAT CAGGCGAGTT TTGCCGACGG CGGTTCGGAT TTGACATTCA 

251 AGGCGTGGAA TTTGGGTGAT GCTTCGCGCG AGCCTGTCGT GTTGAAGGCA 

301 ACATCCATAC ACCAGTTTCC GTTGGAAATT GGCAAACACA AATATCGTCT 

351 TGAGTTCGAT CAGTTCACTT CTATGAATGT GGAGGACATG AGCGAGGGCG 

401 CGGAACGGGA AAAAAGCCTG AAATCCACGC TGCCCGATGT CCGCGCCGTT 

451 ACTCAGGAAG GTCACAAATA CACCAAT TACCG

501 TATCCGTGAT GCGCCAGGCC AGGCGGTCGA ATATAAAAAC TATATGCTGC 

551 CGGTTTTGCR GGAACAGGAT TATTTTTGGA TTACCGGCAC GCGCAGCGC . 

601 TTGCAGCAGC AATACCGCTG GCTGCGTATC CCCTTGGACA AGCAGTTGAA 

651 AGCGGACACC TTTATGGCAT TGCGTGAGTT TTTGAAAGAT GGGGAAGGGC 

701 GCAAACGTCT . GTTGCCGAC GCAACCAAAG GCGCACCTGC CGAAATCCGC 

751 GAACAATTCA TGCTGGCTGC GGAAAACACG CTGAACATCT TTGCACAAAA 

801 AGGCTATTTG GGATTGGACG AATTTATTAC GTCCAATATC CCGAAAGAGC 

851 AGCAGGATAA GATGCAGGGC TATTTCTACG AAATGCTTTA CGGCGTGATG 

901 AACGCTGCTT TGGATGAAAC CAT.ACCCGG TACGGCTTGC CCGAATGGCA 

951 GCAGGATGAA GCGCGGAATC GTTTCCTGCT GCACAGTATG GATGCGTACA 

1001 CGGGTTTGAC CGAATATCCC GCGCCTATGC TGCTGCAACT TGATGGGTTT 

1051 TCCGAGGTGC GTTCGTCGGG TTTGCAGATG ACCCGTTCCC C.GGTCCGCT 

1101 TTTGGTC7AT CTC. . . 



Number 39 ORF 

1 ATGATGAGTA ATAmAATGGm ACAAAAAGGG TTTACATTGA TTGmGmTGAT 

51 GATAGTCGTC GCGATACTCG GCATTATCAG CGTCATTGCC ATACCTTCTT 

101 ATCmAAGTTA TATTGAAAAA GGCTATCAGT CCCAGCTTTA TACGGAGATG 

151 GyCGGTATCA ACAATATTTC CAAACAGTTT ATTTTGAAAA ATCCCCTGGA 

201 CGATAATCAG ACCATCGAGA ACAAACTGGA AATATTTGTC TCAGGCTATA 

251 AGATGAATCC GAAAATTGCC AAAAAaTATA GTGTTTCGGT AAAGTTTGTC 

301 GATAAGGAAA AATCAAGGGC ATACAGGTTG GTCGGCGTTC CGAAGGCGGG 

351 GACGGGTTAT ACTTTGTCGG TATGGATGAA CAGCGTGGGC GACGGATACA 

401 AATGCCGTGA TGCCGCTTCT GCCCAAGCCC ATTTGGAGAC CTTGTCCTCA

451 GATGTCGGCT GTGAAGCCTT CTCTAATCGT AAAAAATAA 



Number 40 ORF 

1 ATGAAAAAAT CCTCCCTCAT CAGCGCATTG GGCATCGGTA TTTTGAGCAT 

51 CGGCATGGCA TTTGCCGCCC CTGCCGACGC GGTAAGCCAA ATCCGTCAAA 

101 ACGCCACTCA AGTATTGAGC ATCTTAAAAA ACGGCGATGC CAACACCGCT 

151 CGCCAAAAAG CCGAAGCCTA TGCGATTCCC TATTTCGATT TCCAACGTAT 

201 GACCGCATTG GCGGTCGGCA ACCCTTGGsG CACCG.GTCC GACG.GCAAA 

251 AACAAGCGTT GGCCn.AGAA TTTCAACCC. . . 



Number 41 ORF 

1 ATGAAACACA TACTCCCCCT GATTGCCGCA TCCGCACTCT GCATTTCAAC

51 CGCTTCGGCA CATCCTGCCA GCGAACCGTC CACTCAAAAC GAAACCGCTA

101 TGATCACGCA TACCCTCATC TCAAAATACA GTTTTGGnnn nnnnnnnnnn

151 nnnnnnnnnn nnGCCATAAA AAGCAAAGGG ATGGACATTT TTGCCGTCAT
201 CGACCATCAG GflAGCCGCAC GCCGAAACGG CTTAACGATG CAGCCGGCAA

251 AAGTCATCGT CTTCGGCACG CCCAAAGCCG 3CACGCCGCT GATGGTCAAA 

301 GACCCCGCCT TCGCCCTGCA ACTGCCCCTA C3CGTCCTCG TTACCGAAAC 

351 GGACGGCAAA GTACGCGCCG CCTATACCGA TACGCGCGCC CTCATCGCCG 

401 GCAGCCGCAT CGGTTTCGAC GAAGTGGCAA ACACTTTGGC AAACGCCGAA 

451 AAACTGATAC AAAAAACCGT AGGCGAATAA 



Number 42 ORF 

1 ATGGCTTTTA TTACGCGCTT ATTCAAAAGC .-.GTAAATGGC TGATTGTGCC 

51 GCTGATGCTC CCCGCCTTTC AGAATGTGGC G3CGGAGGGG ATAGATGTGA 

101 GCCGTGCCGA AGCGAGGATA ACCGACGGCG 33CAGCTTTC CATCAGCAGC 

151 CGCTTCCAAA CCGAGCTGCC CGACCAGCTC CAACAGGCGT TGCGCCGGGg 

201 CGTGCCGCTC AACTTTACCT TAAGCTGGCA 3CTTTCCGCC CCGATAATCG 

251 CTTCTTATCG GTTTAAATTG GGGCAACTGA rTGGCGATGA CGACaATATT 

301 GACTACAAAC TGAGTTTCCA TCCGCTGACc AaACGCTACC GCGTTACCgT 

351 CGgCGCGTTT TCGACAGACT ACGACACCTT GGATGCGGCA TTGCGCGCGA 

401 CCGGCGCGGT TGCCAACTGG AAAGTCCTGA ACAAAGGCGC GCTGTCCGGT 

451 GCGGAAGCAG GGGAAACCAA GGCGGAAATC CGCCTGACGC TGTCCACTTC 

501 AAAACTGCCC AAGCCTTTTC AAATCAATGC ATTGACTTCT CAAAACTGGC 

551 ATTTGGATTC GGGTTGGAAA CCTCTAAACA TCATCGGGAA CAAATAA 



Number 43 ORF 



1 ATGGACACAA AAGAAATCCT CGG.TACGCG GcAGGcTCGA TCGGCAGCGC

51 GGTTTTAGCC GTCATCATCc TGCCGCTGCT GTCGTGGTAT TTCCCCGCCG

101 ACGACATCGG GCGCATCGTG CTGATGCAGA CGGCGGCGGG GCTgACGGTG

151 TCGGTGTTGT GCCTCGGGCT GGATCAGGCA TACGTCCGCG AATACTATGC

201 CACCGCCGAC AAAGACAcCT TGTTCAAAAC CCTGTTCCTG CCGCCGCTGC

251 TGTCTGCCGC CGCGATAGCC GCCCTGCTGC TTTCCCGCCC GTCCCTGCCG

301 TCTGAAATCC TGTTTTCACT CGACGATGCC iCCGCCGGCa TCGGGCTGGT

351 GCTGTTTGAA CtGAGCTTCC TGCCCATCCG            CTGGTTTTGC

401 GTATGGAAGG ACGCGCCcTT GCCTTTTCGT CCGCGCAACT CGTGCcCAAG

451 CTCGCCATCC TGCTGCTG.T GCCGCTGACG GTCGGGCTGC TGCACTTTCC

501 AGCGAACACC GCCGTCCTGA CCGCCGTTTA GGCGCTGGCA AACCTTGCCG

551 CCGCCGCCTT TTTGCTGTTT GAAAACCGA" GCCGTCTGAA GGCCGTCCGG

601 CACGCACCGT TTTCGCCCGC CGTCCTGCAC CGGGGG.TGC GCTACGGCAT

651 ACCGATCGCA CTGAGCAGCA rCGCCTATTG 3GGGCTGGCA TCCGCCGACC

701 GTTTGTTCCT GAAAAAATAT GCCGGCCTGG AACAGCTCGG CGTTTATTCG

751 ATGGGTATTT CGTTCGGCGG GGCGGCATTA 7TGTTCCAAA GCATCTTTTC

801 AACGGTCTGG ACACCGTATA TTTTCCGCGC AATCGAAGAA AACGCCCCGC

851 CCGCTCGCCT CTCGGCAACG GCAGAATCCG CCGCCGCCCT GCTTGCCTCC

901 GCCCTCTGC . TGACCGGCAT TTTCTCGCCC CTTGCCTCCC TCCTGCTGCC

951 GGAAAACTAC GCCGCCGTCC GGTTTATCGT CGTATCGTGT ATG.TGCCGC

1001 CGCTGTTTTG CACGCTGGCG GAAATCAGCG 3CATCGGTTT GAACGTCGTT

1051 CGCAAAACGC GCCCGATCGC GCTCGCCACC TTGGGCGCGC TGGCGGCAAA

1101 CCTGCTGCTG CTGGGGCTTG ACCGTGCCGT ACCGGCGAGG CCGCC.GGCG

1151 CGGCGGTTGC CTGTGCCGCC TCATTCTGGC TGTTTTTTGC CTTCAAGACC

1201 GAAAGCTCyT GCCGCCTGTG GCAGCCGCTC AAACGCCTGC CGCTTTATCT

1251 GCACACATTG TTCTGCCTGA CCTCCTCGGC 3GCCTACACC TGCTTCGGCA

1301 CGCCGGCAAA CTATCCCCTG TTTGCCGGCG r-ATGGGCGGC ATATCTGGCA

1351 GGCTGCATCC TGCGCCACCG GAAAGATTTG CACAAACTGT TTCATTATTT

1401 GAAAAAACAA GGTTTCCCAT TATGA







Number 44 ORF 

1 . .ATCCTGAAAC CGCATAACCA GCTTAAGGAA GACATCCAAC CTGATCCGGC
51 CGATCAAAAC GCCTTGTCCG AACCGGATGC rGCGACAGAG GCAGAGCAGT 
101 CGGATGCGGA AAATGCTGCC GACAAGCAGC CCGTTGCCGA TAAAGCCGAC 
151 GAGGTTGAAG AAAAGGCGGG CGAGCCGGAA CGGGAAGAGC CGGACGGACA 
201 GGCAGTGCGT AAGAAAGCGC TGACGGAAGA 3CGTGAACAA ACCGTCAGGG 
251 AAAAAGCGCA GAAGAAAGAT GCCGAAACGG TTAAAATACA AGCGGTAAAA 
301 CCGTCTAAftG AAACAGAGAA AAAAGCTTCA AAAGAAGAGA AAAAGGCGGC

351 GAAGGAAAAA GTTGCACCCA AACCAACCCC GGAACAAATC CTCAACAGCG 

401 GCAgCATCGA AAAmGCGCGC AgTGCCGCCG CCAAAGAAGT GCAGAAAATG 

451 AA.AACGTCC GACAAGGCGG AAGC.AACGC ATTATCTGCA AATGGGCGCG 

501 TATGCCGACC GTCAGAGCGC GGAAGGGCAG CGTGCCAAAC TGGCAATCTT 

551 GGGCATATCT TCCAAGGTGG TCGGTTATCA GGCGGGACAT AAAACGCTTT 

601 ACCGGGTGCA AAGCGGCAAT ATGTCTGCCG ATGCGGTGA 



Number 45 ORF 

1 ATGAACCACG ACATCACTTT CCTCACCCTG TTCCTACTCG GTkTCTTCGG 

51 CGGAAcGCAC TGCATCGGTA TGTGCGGCGG ATTAAGCAGC GcGTTTGs.s 

101 TCCAACTCCC CCCGCATATC AACCGCTTTT GGCTGATCCT GCTGCTTAAC 

151 ACAGGACGGG TAAGCAGCTA TACGGCAAtC GGCCTGATAC TCGGATTAAT 

201 CGGACAGGTC GGCGTTTCAC TCGAcCAaAC CCGCGTCCTG CAGAATATTT 

251 TATACACGGC CGCCAACCTC CTGCTGCTCT TTTTAGGCTT ATACTTGAGC 

301 GGTATTTCTT CCTTGGCGGC AAAAATCGAG AAaATCGGCA AACCGATATG 

351 GCGGAACCTG AACCCGATAC TCAACCGGCT GTTACCCATA AAATCCATAC 

401 CCGCCTGCCT tGCGgTCGGA ATATTATGGG GCTGGCTGCC GTGCGGACTG

451 GTTTACAGCG CGTCGCTTTA CGCGCTGGGA AgCGGTAGTG CGGCAACGGG

501 CGGGTTATAT ATGCTTGCCT TTGCACTGGG TACGCTGCCC AATCTTtTAG 

551 CAATCGGCAT TTTtTCCCTG CAACTGAAwA AAATCATGCA AAACCGATAT 

601 ATCCGCCTGT GTACGGGATT ATCCGTATCA TTATGGGCAT TATGGAAACT 

651 TGCCGTCCTG TGGCTGTAA 



Number 46 ORF 

1 ATGGAAAACC AAAGGCCGCT CCTAGGCTTT CGCTTGGCAC TTTTGGCGGC 

51 GATGACGTGG GGAACGCTGC CGAT.TCCGT GCGGCAGGTA TTGAAGTTTG 

101 TCGATGCGCC GACGCTGGTG TGGGTGCGTT TTACCGTGGC GGCGGCGGTA 

151 TTGTTTGTTT TGCTGGCACT GGGCGGGCGG CTGCcGAAGC GGCGgGGATT 

201 TTTCTTGGTG CTCATTCAGG CTGCTGCTGC TCGGCGTGGC GGGCATTTCG 

251 GCAAACTTTG TGCTGATTGC CCAAGGGCTG CATTATATTT CGCCGACCAC 

301 GACGCAGGTT TTGTGGCAGA TTTCGCCGTT TACGATGATT GTwGTCGGTG 

351 TGTTGGTGTT TAAAGACCGG ATGACTGCCG CTCAGAAAAT CGGCTTGGTT 

401 TTGCTGCTTG CCGGTTTGCT TATGTATTTT AACGATAAAT TCGGCGAGTT

451 GTCGGGTTTG GGCGCGTATG C.AAGGGCGT GTTGCTGTGT GCGGCAGGCA 

501 GTATGGCATG GGTGTGTAAT GCCGTGGCGC AAAAGCTGCT GTCGGCGCAA 

551 TTCGGGCCGC AACAGATTCT GCTGTTGATT TATGCGGCAA GTGCCGCCGT 

601 GTTCCTGCCG TTTGCCGAAC CGGCACACAT CGGAAGTATG GACGGTACGT 

651 TGGCGTGGGT ATGTATTGCG TATTGCTGCT TGAATACGTT AATCGGTTAC 

701 GGCTCGTTCG GCGAGGCGTT GAAACATTGG GAGGCTTCCA AAGTCAGCGC 

751 GGTAACAACC TTGCTCCCCG TGTTTACCGT AATAAATACT TTGCTCGGGC 

801 ATTATGTGAT GCCTGAAACT TTTGCCGCGC CGGA. . 



Number 47 ORF 

1 ATGGTAGCTC GTCGGGCTCA TAACCCGAAG GTCGTAGGTT CGAATCCTGT 

51 .CCCGCAACC TAATTTCAAA CCCCTCGGTT CAATGCCGAG GG . GTTTTGT 

101 T.TTGCCTGT TTCCTGTTTC CTGTTTCCTG CCGCCTCCGT TTTTTGCCGG 

151 ATTTTCCTTC CGGCCGCAAT ATCGGAACGG CAGACCGCCG TCTGTTTGCG 

201 GTTGCAAATT CAGGCAGTTT GGCTACAATC TTCCGCATTG TCTTCAAGAA 

251 AGCCAACCAT GCCGACCGTC CGTTTTACCG AATCCGTCAG CAAACAAGAC 

301 CTTGATGCTC TGTTCGAGTG GGCAAAAGCA AGTTACGGTG CAGAAAGTTG 

351 CTGGAAAACG CTGTATCTGA ACGGTCysCC TTTGGGCAAC CTGTCGCCGG 

401 AATGGGTGGA ACGCGTsmmA AAAGACTGGG AGGCAGGCTG CyCGGAGTCT 

451 TCAGACGGCA TTTTTCTGAA TgCGGACGGc TGgCctGATA TGGgCGGAcg 

501 cTTACAGCAC CTCGCCCTCG GTTGGCACTG TGCGGGGCTG TTGGACGgsT 

551 GGCGCAACGA GTGTTTCGAC CTGACCGACG GCGGCGGCAA CCCCTTGTTC 

601 ACGCTCGaAc GCGCCGyTTT mCGTCCTkTC GGACTGCTCA GCCGCGCCGT 

651 CCATCTCAAC GGTCTGACCG AATCGGACGG CCGATGGCAT TTCTGGATAG 

701 GCAGGCGCAG TCCGCACAAA GCAGTCGATC CCAACAAACT CGACAATACT 
751 rCCGCCGGCG GTGTTTCCGG CGGCGAAATG CCGTCTGAAG CCGTGTGTCG
801 CGAAAGCAGC GAAGAAGCCG GTTTGGATAA AACGCTGcTT CCGCTCATCC 
851 GCCCGGTATC GCAGCTGCAC AGCCTGCGCT CCGTCAGCCG GGGTGTACAC 
901 AATGAAATCC TGTATGTATT CGATGCCGTC CTGCCG. . . 



Number 48 ORF 

1 ATGAATAGAC CCAAGCAACC CTTCTTCCGT CCCGAAGTCG CCGTTGCCCG

51 CGAAAGCAGC CTGACGGGTA AAGTGATTCT GACACGACCG TTGTCATTTT

101 CCCTATGGAC GACATTTGCA TCGATATCTG CGTTATTGAT TATCCTGTTT

151 TTGATATTTG GTAACTATAC GCGAAAGACA ACAGTGGAGG GACAAATTTT

201 ACCTGCATCG GGCGTAATCA GGGTGTATGC ACCGgATACG rGkACAATTA

251 CAGCGAAATT CGTGGAAGAT GGmsAAAAGG TTAAGGCTGG CGACAAGCTA

301 TTTGCGCTTT CGACCTCACG TTTCGGCGCA GGAGGTAGCG TGCAGCAGCA

351 GTTGAAAACG GAGGCAGTTT TGAAGAAAAC GTTGGCAGAA CAGGAACTGG

401 GTCGTCTGAA GCTGATACAC GGGAATGAAA CGCGCAgCcT TAAAGCAACT

451 GTCGAACGTT TGGAAAACCA GGAACTCCAT ATTTCGCAAC AGATAGACGG

501 TCAGAAAAGG CGCATTAGAC TTGCGGAAGA AATGTTGCAG AAATATCGTT

551 TCCTATCCGC .CAATGA



Number 49 ORF 

1 ATGCTGAATA CTTTTTTTGC CGTATTGGGC GGCTGCCTGC TGCT.TTGCC 

51 GTGCGGCAAA TCCGTAAATA CGGCGGTACA GCCGCAAAAC GCGGTACAAA 

101 GCGCGCCGAA ACCGGTTTTC AAAGTCATAT ATATCGACAA TACGGCGATT 

151 GCCGGTTTGG ATTTGGGACA AAGCAGCGAA GGCAAAACCA ACGACGGCAA 

201 AAAACAAATC AGTTATCCGA TTAAAGGCTT GCCGGAACAA AATGTTATCC 

251 GACTGATCGG CAAGCATCCC GGCGACTTGG AAGCCGTCAG CGGCAAATGT 

301 ATGGAAACCG ATGATAAGGA CAGTCCGGCA GGTTGGGCAG AAAACGGCGT 

351 GTGCCATACC TTGTTTGCCA AACTGGTGGG CAATATCGCC GAAGACGGCG 

401 GCAAACTGAC GGATTACCTA GTTTCGCATG CCGCCCTGCA ACCCTATCAG 

451 GCAGGCAAAA GCGGCTATGC CGCCGTGCAG AACGGACGCT ATGTGCTGGA 

501 AATCGACAGC GAAGGGGCGT TTTATTTCCG CCGCCGCCAT TATTGA 



Number 50 ORF 

1 ATGGAAGATT TATATATAAT ACTCGCTTTG GGTTTGGTTG CGATGATTGC

51 CGgATTTATC GATgcgatTg cGggCGGGGG TGGTTTGATT ACGCTGCCCG

101 CACTCTTGTT GGCAGGTATT CCTCCCGTGT CGGCAATTGC CACCAACAAG

151 CTGCAAgCAG CCGCTGCTAC GTTTTCAGCT ACGGTTTCTT TTGCACGCAA

201 AGGTTTGATT GATTGGAAGA AAGGTCTCCC GATTGCCGCA GCATCGTTTG

251 TAGGCGGCGT GGcCGGTGCA TTATCGGTCA GCTTGGTTTC CAAAGATATT

301 CTgCTgGCGG TCGTGCCGGT TTTGTTGATA TTTGTCGCAC TGTATTTTGT

351 GTTTTCGCCC AAGCTCGACG GCAGTAAGGA AGGCAAAGCC AGAATGTCTT

401 TTTTTCTGTT cGGGCTGACG GTCGC.ACCG CTTTTGGGTT TTTACGACGG

451 TGTGTTCGGA CCGGGTGTCG GCTCGTTTTT TCTGATTGCC TTTATTGTTT

501 TGCTCGGCTG CAAgCTGTTG AACGCGATGT CTTACACCAA ATTGGCGAAC

551 GTTGCCTGCA ATCTTGGTTC GCTATCGGTA TTCCTGCTGC ACGGTTCGAT

601 TATTTTCCCG ATTGCGGCAA CGaTGGCGGT CGGTGCGTTT GTCGGtGCGA

651 ATTTAgGTGC GAGATTTGCC GTaCgctTCG GTTCGAAGCT GATTAA



Number 51 ORF

1 . . CTGCTAGGGT ATTGCATCGG TTATCGGTAC GGCTGTTGCA GCAAAACCAG

51 CCGCAGACGG ATTATTTGGT CAAATTCGGA TCGTTTTGGG CGAG.ATTTT

101 TGGTTTTCTG GGACTGTATG ACGTCTATGC TTCGGCATGG TTTGTCGTTA

151 TCATGATGTT TTTGGTGGTT TCTACCAGTT TGTGCCTGAT TCGCAATGTG

201 CCGCCGTTCT GGCGCGAAAT GAAGTCTTTT CGGGAAAAGG TTAAAGAAAA

251 ATCTCTGGCG GCGATGCGCC ATTCTTCGCT GTTGGATGTA AAAATTGCGC

301 CCGAGGTTGC CAAACGTTAT CTGGAAGTAC AAGGTTTTCA GGGGAAAACC

351 ATTAACCGTG AAGACGGGTC GGTTCTGATT GCCGCCAAAA AAGGCACAAT

401 GAACAAATGG GGCTATATCT TTGCCCATGT TGCTTTGATT GTCATTTGCC

451 TGGGCGGGTT GATAGACAGT AACCTGCTGT TGAAACTGGG TATGCTGACC 

501 GGTCGGATTG TTCCGGACAA TCAGGCGGTT TATGCCAAGG ATTTC.AAGC 

551 CCGAAAGTAT . TTTGGGTGC gTCCAATCTC TCATTTAGGG GCAACGTCAA 

601 TATTTCCG.A GGGGCAGAgT GCGGATGTGG TTTTCCTGA 



Number 52 ORF 

1 ATGCCGTCTG AAACACGCCT GCCGAACTTT ATCCGCGTCT TGATATTTGC 

51 CCTGGGTTTC ATCTTCCTGA ACGCCTGTTC GGAACAAACC GCGCAAACCG 

101 TTACCCTGCA AGGCGAAACG ATGGGCACGA CCTATACCGT CAAATACCTT 

151 TCAAATAATC GGGACAAACT CCCCTCACCT GCCGAAATAC AAAAACGCAT 

201 CGATGACGCG CTTAAAGAAG TCAACCGGCA GATGTCCACC TATCAGCCCG 

251 ACTCCGAAAT CAGCCGGTTC AACCAACACA CAGCCGGCAA GCCCCTCCGC 

301 ATTTCAAGCG ACTTCGCACA CGTTACTGCC GAAGCCGTCC GCCTGAACCG 

351 CCTGACACAC GGCGCGCTGG ACGTAACCGT CGGCCCCTTG GTCAACCTTT 

401 GGGGATTCGG CCCCGACAAA TCCGTTACCC GTGAACCGTC GCCGGAACAA

451 ATCAAACAGG CGGCATCTTA TACGGGCATA GACAAAATCA TTTTGAAACA
501 AGGCAAAGAT TACGCTTCCT TGAGCAAAAC CCACCCCAAG GCCTATTTGG 

551 ATTTATCTTC GATTGCCAAA GGCTTCGGCG TTGATAAAGT TGCGGGCGAA
601 CTGGAAAAAT ACGGCATTCA AAATTATCTG GTCGAAATCG GCGGCGAGTT 
651 GCACGGCAAA GGCAMAACG CGCGCGGCGA ACCGTGGCGC ATCGGTATCG 
701 AGCAGCCCAA TATCGTCCAA GGCGGCAATA CGCAGATTAT CGTCCCGCTG
751 AACAACCGTT CGCTTGCCAC TTCCGGCGAT TACCGTATTT TCCACGTCGA
801 TAAAAACGGC AAACGCCTCT CCCATATCAT CAACCCGAAC AACAAACGAC 
851 CCATCAGCCA CAACCTCGCC TCCATCAGCG TGGTCGCAGA CAGTGCGATG 
901 ACGGCGGACG GCTTGTCCAC AGGATTATTC GTATTGGGCG AAACCGAAGC 
951 CTTAAAGCTG GCAGAGCGCG AAAAACTCGC TGTTTTCCTG ATTGTCAGGG 

1001 ATAAAGGCGG CTACCGCACC GCCATGTCTT CCGAATTTGA AAAACTGCTC 

1051 CGCTAA 



Number 53 ORF 

1 . . CCGTGCCGCC GACAGGGCGA CGACGTGTAT GCGGCGCACG CGTCCCGTCA 

51 AAAATTGTGG CTGCGCTTCA TCGGCGGCCG GTCGCATCAA AATATACGGG 

101 GCGGCGCGGC TGCGGACGGG TGGCGCAAAG GCGTGCAAAT CGGCGGCGAG 

151 GTGTTTGTAC GGCAAAATGA AGGCAGCCkA yTGGCAATCG GCGTGATGGG 

201 CGGCAGGGCC GGCCAGCACG CwTCAGTCAA CGGCAAAGGC GGTGCGGCAG 

251 gCAGTGATTT GTATGGTTAT GgCGGGGgTG TTTATGCTgC GTGGCATCAG 

301 TTGCGCGATA AACAAACGGG TgCGTATTTG GACGGCTGGT TGCAATACCA 

351 ACGTTTCAAA CACCGCATCA ATGATGAAAA CCGTGCGGAA CgCTACAAAA 

401 CCAAAGGTTG GACGGCTTCT GTCGAAGGCG GCTACAACGC GCTTGTGGCG 

451 GAAGGCATTG TCGGAAAAGG CAATAATGTG CGGTTTTACC TACAACCGCA 

501 GgCGCAGTTT ACCTACTTGG GCGTAAACGG CGGCTTTACC GACAGCGAGG 

551 GGACGGCGGT CGGACTGCTC GGCAGCGGTC AGTGGCAAAG CCGCGCCGGC 

601 AtTCGGGCAA AAACCCGTTT TGCTTTGCGT AACGGTGTCA ATCTTCAGCC 

651 TTTTGCCGCT TTTAATGTtt TGCACAGGTC AAAATCTTTC GGCGTGGAAA 

701 TGGACGGCGA AAAACAGACG CTGGCAGGCA GGACGGCACT CGAAGGGCGG 

751 TTCGGTATTG AAGCCGGTTG GAAAGGCCAT ATGTCCGCA. . 



Number 54 ORF 

1 . . GCGGAATATG TTCAGTTCTC TATAGATTTG TTCAGTGTGG GTAAATCGGG 

51 GGGCGGTATA CCTAAGGCTA AGCCTGTGTT TGATGCGAAA CCGAGATGGG 

101 AGGTTGATAG GAAGCTTAAT AAATTGACAA CTCGTGAGCA GGTGGAGAAA 

151 AATGTTCAGG AAACGAGAAG AAGGAGTCAG AGTAGTCAGT TTAAAGCCCA 

201 TGCGCAACGA GAATGGGAAA ATAAAACAGG GTTAGATTTT AATCATTTTA 

251 TAGGTGGTGA TATCAATAAA AAAGGCACAG TAACAGGAGG GCATAGTCTA 

301 ACCCGTGGTG ATGTACGGGT GATACAACAA ACCTCGGCAC CTGATAAACA 

351 TGGGGT.TTA TCAAGCGACA GTGGAAATTN A 
Number 55 ORF

1 ATGAATATTC ACACCCTGCT CTCCAAACAA TSGACGCTGC CGCCATTCCT 

51 GCCGAAACGG CTGCTGCTGT CCCTGCTGAT ACTGCTTGCC CCCAATGCGG 

101 TGTTTTGGGT TTTGGCACTG CTGACCGCCA CCGCCCGCCC GATTGTCAAT 

151 TTGGACTATC TTCCCGCCGC GCTGCTGATC GCCCTGCCTT GGCGTTTCGT 

201 CAAAATTGCC GGCGTATTGG CGTTTTGGCT GGCGGTTTTG TTTGACGGGC 

251 TGATGATGGT GATCCAACTC TTCCCTTTTA TGGATCTCAT CGGCGCCATC 

301 AACCTCGTCC CCTTCATCCT GACCGCCCCC GCCCCTTATC AGATAATGAC 

351 CGGGCTG. . . 



Number 56 ORF 



1 . . GTGAGCGGAC GTTACCGCGC TTTGGATCGC GTTTCCAAAA TCATCATCGT

51 TACTTTGAGT ATCGCCACGC TTGCCGCCGC CGGCATCGCT ATGTCGCGCG

101 GTATGCAGAT GCAGTCCGAT TTTATCGAGC CGACACCGTG GACGCTTGCC

151 GGTTTGGGCT TCCTGATCGC GCTGATGGGC TGGATGCCCG CGCCGATTGA

201 AATTTCCGCC ATCAATTCTT TGTGGGTAAC CGAAAAACAA CGCATCAATC

251 CTTCCGAATA CCGCGACGGG ATTTTTGAAT TCAACGTCGG TTATATCGCC

301 AGTGCGGTTT TGGCTTTGGT TTTCCTTGCA CTGGGCGC.G TAGCGCCGAA

351 CGGCAACGGC GA.ACAGTGC AGATGGCGGG CGGCAAATAT AACGGGCAAT

401 TGATCAATAT GTACGCC. .



Number 57 ORF 



. . TTGCGGGAAA CGGCATATGT TTTGGATAGT TTTGATCGTT ATTTTGTTGT
TGCGCTTGCC GGCTTGTTTT TTGTCCGCGC ACAATCCGAA CGCGAGTGGA 
TGCGCGAGGT TTCTGCGTGG CAGGAAAAGA AAGGGGAAAA ACAGGCGGAG 
CTGCCTGAA/i TCAAAGACGG TATGCCCGAT TTTCCCGAAC TTGCCCTGAT 
GCTTTTCCAC GCCGTCAAAA CGGCAGTGTA TTGGCTGTTT GTCGGTGTCG 
TCCGTTTCTG CCGAAACTAT CTGGCGCACG AATCCGAACC GGACAGGCCC 
GTTCCGCCT. . 



Number 58 ORF 

1 ATGATTTATC AAAGAAACCT CATCAAAGAA CTCTCTTTTA CCGCCGTCGG 

51 CATTTTCGTC GTCCTCTTGG CGGTATTGGT CTCCACGCAG GCAATCAACC 

101 TGCTCGGCCG TGCCGCCGAC GGGC..GTGA TCGCCATCGA TGCCGTGTTG 

151 GCATTGGTCG GCTTCTGGGT C 

// 

901 A TTGCCATCGG TTTGTTTTTA ATTTACCAAA ACGGGCTGAC 

951 CCTGCTTTTT GAAGCCGTGG AAGACGGCAA AATCCATTTT TGGCTCGGAC 

1001 TGCTGCCTAT GCACATTATC ATGTTTGTCC TTGCACTCAT CCTGTTGCGC 

1051 GTCCGCAGTA TGCCCAGCCA GCCCTTCTGG CAGGCGGTTG GCAAAAGTCT 

1101 GACATTGAAA GGCGGAAAAT GA 



Number 59 ORF

1 . . GGTGGTGGTT TTATCAATGC TTCCTGTGCC ACTTTGACGA CAGCCAAACC

51 GCAATATCAA GCAGGAGACC TTAGCGCTTT TAAGATAAGG CAAGGCAATG 

101 TTGTAATCGC CGGACACGGT TTGGATGCAC GTGATACCGA TTACACACGT 

151 ATTCTCAGTT ATCATTCCAA AATCGATGCA CCCGTATGGG GACAAGATGT 

201 TCGTGTCGTC GCGGGACAAA' ACGATGTGGC CGCAACAGGT GATGCACATT 

251 CGCCTATTCT CAATAATGCT GCTGCCAATA CGTCAAACAA TACAGCCAAC 

301 AACGGCACAC ATATCCCTTT ATTTGCGATT GATACAGGCA AATTAGGAGG 

351 TAT.GTATGC CAACAAAATC ACCTTGATCA GTACGGTCGA GCAAGCAGGC 

401 ATTCGTAA 
Number 60 ORF

1 . . TCAACGGGAC ATAGCGAACA AAATTACACT TTGCCGCGAG AAATCACACG 

51 CAACATTTCA CTGGGTTCAT TTGCCTATGA ATCGCATCGC AAAGCATTAA 

101 GCCATCATGC GCCCAGCCAA GGCACTGAGT TGCCGCAAAG CAACGGTATT 

151 TCGCTACCCT ATACGTCCAA TTCTTTTACC CCATTACCCA GCAGCAGCTT 

201 ATACATTATC AATCCTGTCA ATAAAGGCTA TCTTGTTGAA ACCGATCCAC 

251 GCTTTGCCAA CTACCGTCAA TGGTTGGGTA GTGACTATAT GCtGGACAGC 

301 CTCAAACTAG ACCCAAACAA TTTACATAAA CGTTTGGGTG ATGGTTATTA 

351 CGAGCAACGT TTAATCAATG AACAAATCGC AGAGCTGACA GGGCATCGTC 

401 GTTTAGAcGG TTATCAAAAC GACGAAGAAC AATTTAAAGC CTTAATGGAT 

451 AATGGCGCGA CTGCGGCACG TTcGATGAAT CTCAGCGTTG GCATTGCATT 

501 AAGTGCCGAG CAAGTAGCGC AACTGACCAG CGATATTGTT TGGTTGGTAC 

551 AAAAAGAAGT TAAGCTTCCT GATGGCGGCA CACAAACCGT ATTGGTGCCA 

601 CAGGTTTATG TACGCGTTAA AAATGGCGAC ATAGACGGTA AAGGTGCATT 

651 GTTGTCAGGC AGCAATACAC AAATCAATGT TTCAGGCAGC CTGAT^AAACT 

701 CAGGCACGAT TGCAGGgCGC AATGCGCTTA TTATCAATAC CGATACGCTA 

751 GACAATATCG GTGGGCGTAT TCATGCGCAA AAATCAGCGG TTACGGCCAC 

801 ACAAGACATC AATAATATTG GCGGCATGCT TTCTGCCGAA CAGACATTAT 

851 TGCTCAACGC AGGCAACAAC ATCAACAGCC AAAGCACCAC CGCCAGCAGT 

901 CAAAATACAC AAGGCAGCAG CACCTACCTA GACCGAATGG CAGGTATTTA 

951 TATCACAGGC AAAGAAAAAG GTGTTT.. 



Number 61 ORF 

1 . . TCAGGGAATA ACCTCAATGC CAAAGCTGCC GAAGTCAGCA GCGCAAACGG 

51 TACACTCGCT GTGTCTGCCA ATAATGACAT CAACATCAGC GCAGGCATCA 

101 ACACGACCCA TGTTGATGAT GCGTCCAAAC ACACAGGCAG AAGCGGTGGT 

151 GGCAATAAAT TAGTCATTAC CGATAAAGCC CAAAGTCATC ACGAAACCGC 

201 CCAAAGCAGC ACCTTTGAAG GCAAGCAAGT TGTATTGCAG GCAGGAAACG 

251 ATGCCAACAT CCTTGGCAGC AATGTTATTT CCGATAATGG CACCCAGATT 

301 CAAGCAGGCA ATCATGTTCG CATTGGTACA ACCCAAACTC AAAGCCAAAG 

351 CGAAACCTAT CATCAAACCC AGAAATCAGG ATTGATGAGT GCAGGTATCG 

401 GCTTCACTAT TGGCAGCAAG ACAAACACAC AAGAAAACCA ATCCCAAAGC

451 AACGAACATA CAGGCAGTAC CGTAGGCAGC TTGAAAGGCG ATACCACCAT 

501 TGTTGCAGGC AAACACTACG AACAAATCGG CAGTACCGTT TCCAGCCCGG 

551 AAGGCAACAA TACCATCTAT GCCCAAAGCA TAGACATTCA AGCGGCACAC 

601 AACAAATTAA ACAGTAATAC CACCCAAACC TATGAACAAA AAGG.CTAAC 

651 GGTGGCATTC AGTTCGCCCG TTACCGATTT GGCACAACAA . . . 



Number 62 ORF 



1 ATGATTTACA TCGTACTGTT TCTAGCTGTC GTCCTCGCCG TTGTCGCCTA

51 CAACATGTAT CAGGAAAACC AATACCGCAA AAAAGTGCGC GACCAGTTCG

101 GACACTCCGA CAAAGATGCC CTGCTCAACA GCAwAACCAG CCATGTCCGC

151 GACGGCAAAC CGTCCGGCGG GTCAGTCATG ATGCCGAAAC CCCAACCGGC

201 GGTCAAAAAA ACGGCAAAAC CCCAAGACCC CGyCATGCGC AACCTGCAAG

251 AACAGGATGC CGTCTACATC GCCAAGCAGA AACAGGCAAA AGCCTCCCCG

301 TTCAAAACCG AAATCGAAAC CGCCTTGGAA GAAAGCGGCA TTATCGGCAA

351 CTCCGCCCAC ACCGTTTCCG AACCCCAAAC CGGACATTCC GCAACGAAAC

401 CTGCCGACGC GTCGGCAAAA CCTGCACCCG TTCCGCAAAC ACCTGCAAAA

451 CCGCTGATTA CGCTCAAAGA ACTGTCAAAA GTCGAATTAT CCTGGTTTGA

501 CGTGCGCATC GACTTCATCT CCTAT. . .







Number 63 ORF 

1 . . GCGCGGCACG GCACGGAAGA TTTCTTCATG AACAACAGCG ACAC . ATCAG 

51 GCAGATAGTC GAAAGCACCA CCGGTACGAT GAAGCTGCTG ATTTCCTCCA 

101 TCGCCCTGAT TTCATTGGTA GTCGGCGGCA TCGGCGTGAT GAACATCATG 

151 CTGGTGTCCG TTACCGAGCG CACCAAAGAA ATCGGCATAC GGATGGCAAT 

201 CGGCGCGCGG CGCGGCAATA TTTyGCAGCA GTTTTTGATT GAGGCGGTGT 

251 TAATCTGCGT CATCGGCGGT TTGGTCGGCG TGGGTTTGTC CGCCGCCGTC 
301 AGCCTCGTGT TCAATCATTT TGTAACCGAC TTCCGGATGG ACATTTCCGC

351 CATGTCCGTC ATCGGCGCGG TCGCCTGTTC GACCGGAATC GGCATCGCGT 

401 TCGGCTTTAT GCCTGCCAAT AAAGCAGCCA AACTCAATCC GATAGACGCA 

451 TTGGCACAGG ATTGA 



Number 64 ORF 



. . GGGACGGGAG CGATGCTGCT GCTGTTTTAC GCGGTAACGA T . CTGCCTTT 
GGCCACTGGC GTTACCCTGA GTTACACCTC GTCGATTTTT TTGGCGGTAT 
TTTCCTTCCT GATTTTGAAA GAACGGATTT CCGTTTACAC GCAGGCGGTG 
CTGCTCCTTG GTTTTGCCGG CGTGGTATTG CTGCTTAATC CCTCGTTCCG 
CAGCGGTCAG GAAACGGCGG CACTCGCCGG GCTGGCGGGC GGCGCGATGT 
CCGGCTGGGC GTATTTGAAA GTGCGCGAAC TGTCTTTGGC GGGCGAACCC 
GGCTGGCGCG TCGTGTTTTA CCTTTCCGTG ACAGGTGTGG CGATGTCGTC 
GGTTTGGGCG ACGCTGACCG GCTGGCACAC CCTGTCCTTT CCATCGGCAG 
TTTATCTGTC GTGCATCGGC GTGTCCGCGC TGATTGCCCA ACTGTCGATG 
ACGCGCGCCT ACAAAGTCGG CGACAAATTC ACGGTTGCCT CGCTTTCCTA 
TATGACCGTC GTTTTTTCCG CTCTGTCTGC CGCATTTTTT CTGGGCGAAG 
AGCTTTTCTG GCAGGAAATA CTCGGTATGT GCATCATCAT CSTCAGCGGT 
ATTTTGA 



Number 65 ORF 

1 ATGAAGCGGC GTATAGCCGT CTTCGTCCTG TTCCCGCAGA TAATCCGAGT 

51 TTTGGGACAA CTGTTGCCGA AAATCGTCAA TACAGTTCCG GCACATCGGA 

101 TGCTCTTCCA GATTTTCGGG ATGTTCTTTT TCTTCATACA CCAGCAATAT 

151 CTGCCCGGGA TCGCCGAAAT CGATTCCCCA TGCGGCATCG TGTTCGGTGC 

201 GCTCCTCTTC CGTCATCTGC CCGCGCATTG CCTGTATGGT AAAGCCGCCG 

251 TAGGGGATGC CgTTGCACAC GAACATCCAG TCGCTGATGT CGTCAACCGG 

301 AACGCAAACG cTTTCGCCTT GTTCGACATT GGTCAGTTCG CCsGGTTCAT 

351 TGTTCAGCAC ACCGTAAATA TAAAGACCGT CAAAATAAAT ATCGTCGATC 

401 CACATATGTT CGCAAATTTC GCCGTCTTCG CCGTCTTGGA AAAAAGGGAC 

451 TTTGACCATG GCAAAATCCA AGGCGGAAAT AATGCGGCGG CGTTCCCAAA 

501 AAAGcTCGCG CCAAAAATAT TTGAATGTTT TACGGGCGCG TTCGTCGGCA 

551 CGGTTTACCG GTTCGTCTGC CTGTTCTACA TAATAAATGA CGGAATCGCC 

601 CATCATATCT GCTCCTCAAC GTGTACGGTA TCTGTTTGCA CCTTACTGCG 

651 GCTTTCTgcC kTCGGCATCC GATTCGGATT TGAAAAGTTC ininrwyATTCG 

701 GAATAG 



Number 66 ORF 

1 ATGGAAAATA TGGTAACGTT TTCAAAAATC AGACCGCTTT TGGCAATCGC

51 CGCCGCCGCG TTGCTTGCCG CC.TGCGGAC GGCGGGAAAT AATGCTGTCC

101 GCAAGCCGGT GCAAACCGCC AAAGCCGCCG CAGTGGTCGG TTTGGCACTC

151 GGTGGCGGCG CATCTAAAGG ATTTGCCCAT GTAGGTATTA TTAAGGTTTT

201 GAAAGAAAAC GGTATTCCTG TGAAGGTGGT TACCGGCACC TCCGCAGGTT

251 CGATTGTCGG CAACCTTTTT GCATCGGGTA TGTCGCCCGA CCGCCTCGAA

301 TTGGAAGCCG AAATTTTAGG CAAAACCGAT TTGGTCGATT TAACCTTGTC

351 CACCAATGGG TTTATCAAAG GCGCAAAGCT GCAAAATTAC ATCAACCGAA

401 AACTCCGCGG CATGCAGATT CAGCAGTTTC CCATCAAATT TGCCGCC. .



Number 67 ORF

1 ATGTTTCGTT TACAATTCAG GCTGTTTCCC CCTTTGCGAA CCGCCATGCA

51 CATCCTGTTG ACCGCCCTGC TCAAATGCCT CTCCCTGcTG CCGCTTTCCT 

101 GTCTGCACAC GCTGGGAAAC CGGCTCGGAC ATCTGGCGTT TTACCTTTTA 

151 AAGGAAGACC GCGCGCGCAT CGTCGCCniAT ATGCGGCAGG CGGGTTTGAA 

201 CCCCGACCCC AAAACGGTCA AAGCCGTTTT TGCGGAAACG GCAAAAGGCG 

251 GTTTGGAACT TGCCCCCGCG TTTTTCAGAA AACCGGAAGA CATAGAAACA 

301 ATGTTCAAAG CGGTACACGG CTGGGAACAT GTGCAGCAGG CTTTGGACAA 

351 ACACGAAGGG CTGCTATTC. . 
Number 68 ORF

1 . . GCGTGGTCGG CCGGCGAATC GTGGCGTGTG TTAATGGAAA GTGAAACGTG 

51 GCATGCGGTG TGGAATACTT TGCGCTTCTC GGCGGCGGCG GTGTATGCGG 

101 CAGCGGTTTT GGGTGTGGTG TATGCGGCGC CGGCGCGGCG GTCGGCGTGG 

151 ATGCGCGGGC TGATGTTTTA GCCGTTTATG GTGTCGCCGG TTTGTGTTTC 

201 GGCGGGCGTG CTGCTGCTTT ATCCGCAGTG GACGGCTTCG TTGCCGTTGC 

251 TGCTGGCGAT GTATGCGCTG CTGGCGTATC CGTTTGTGGC AAAAGATGTT 

301 TTATCAGCCT GGGATGCACT GCCGCCGGAT TACGGCAGGG CGGCGGCGGG 

351 TTTGGGTGCA AACGGCTTTC AGACGGCATG CCGCATCACG TTCCCCCTCT 

401 TGAAACCGGC GTTGCGGCGC GGTCTGACTT TGGCGGCGGC MCCTGCGTG 

451 GGCGAATTTG CGGCGACATT GTTTCTGTCG CGTCCGGAAT GGCAGACGCT 

501 GACGACTTTG ATTTATGCCT ATTTGGGACG CGCGGGTGAG GATAATTACG 

551 CGCGGGCGAT GGTGCTG. . 



Number 69 ORF 

1 ATGGACGGCT GGACACAGAC GCTGTCCGCG CAAACCCTGT TGGGCATTTC 

51 GGCGGCGGCA ATCATCCTCA TTCTGATTTT AATCGTCAGA TTCCGCATCC 

101 ACGCGCTGCT GACACTGGTC ATCGTCAGCC TGCTGACGGC TTTGGCAACC 

151 GGTTTGCCCA CAGGCAGCAT TGTCAAAGAC ATACTGGTCA AAAACTTCGG 

201 CGGCACGCTC GGCGGCGTGG CGCTTCTGGT CGGCCTGGGC GCGATGCTCG 

251 AACGTTTGGT C. . . 



Number 70 ORF

1 . . GATTTCGGCA TATCGCCCGT GTATCTTTGG GTTGCCGCCG CGTTCAAACA

51 TTTGCTGTCG CCGTGGGCTG CCGACTCATA CGATGTCGCA CGCTTTGCAG 

101 GCGTATTTTT TGCCGTTATC GGACTGACTT CCTGCGGCTT TGCCGGTTTC 

151 AACTTTTTGG GCAGACACCA CGGGCGCAC. GTCGTCCTGA TTCTCATCGG 

201 CTGTATCGGG CTGATTCCAG TTGCCCATTT CCTCAACCCC GCTGCCGCCG 

251 CCTTTGCCGC CGCCGGACTG GTGCTGCACG GTTATTCTTT GGCTCGCCGG 

301 CGCGTGATTG CCGCCTCTTT TCTGCTCGGT ACGGGCTGGA CGCTGATGTC 

351 GTTGGCAGCA GCTTATCCGG CAGCATTTGC CCTGATGCTG CCCTTGCCCG 

401 TACTGATGTT TTTCCGTCCG . . 



Number 71 ORF 

1 . . CAATCCGCCA AATGGTTATC GGGCCAAACT CTAGTCGGCA CAGCAATTGG 

51 GATACGCGGG CAGATAAAGC TTGGCGGCAA CCTGCATTAC GATATATTTA 

101 CCGGCCGCGC ATTGAAAAAG CCCGAATTTT TCCAATCAAG GAAATGGGCA 

151 AGCGGTTTTC AGGTAGGCTA TACGTTTTAA 



Number 72 ORF 

1 ATGCGGACGA AATGGTCAGC AGTGAGAAGC TGCTTACTTG GgCGGACACC 

51 GCCGACATCG ATACCGCTTT GAACCTGTTG TACCGTTTGC AAAAACTCGA 

101 ATTCCTCTAT GGCGATGAAA ACGGTCATTC AGACGGCATG AATTTGwCGG 

151 ACGAGCAATT GCCGTTGCTG ATGGAACAAT TGTCCGGCAG CGGTAAGGCG 

201 TTATTGGTCG ATCGGAACGG TCTGTATCTT GCCAACGCCA ATTTCCATCA 

251 TGAGGCGGCG GAAGAGTTGG GGTTGTTGGC GGCAGAAGTC GCACAGATGG 

301 AAAAGAAATA CCGGCTGCTG ATTAAGAACA AC. 



Number 73 ORF 



1 ATGACCTTTT TACAACGTTT GCAAGGTTTG GCAGACAATA AAATCTGTGC 
51 GTTTGCATGG TTCGTCGTCC GCCGCTTTGA TGAAGAACGC GTACCGCAGr 
101 CGGCGGCAAG CATGACGTTT ACGACGCTGC TGGCACTCGT CCCCGTGCTG

151 ACCGTGATGG TGGCGGTCGC TTCGATTTTC CCCGTGTTCG ACCGCTGGTC 

201 GGATTCGTTC GTCTCCTTCG TCAACCAAAC CATTGTGCCG CA.GGCGCGG 

251 ACATGGTGTT CGACTATATC AATGCGTTCC GCGAGCAGGC GAACCGGCTG 

301 ACGGCAATCG GCAGCGTGAT GCTGGTCGTT ACCTCGC7GA TGCTGATTCG 

351 GACGATAGAC AATACGTTCA ACCGCATCTG GaCGGGTCAA wTyCCAGCGT 

401 CCGTGGATG.. 



Number 74 ORF 

1 . .AGACACGCCC GCCGCATCCG CATCGACACC GCCATCAACC CCGAACTGGA 

51 AGCCCTCGCC GAACACCTCC ACTACCAATG GCAGGGCTTC CTCTGGCTCA 

101 GCACCGATAT GCGTCAGGAA ATTTCCGCCC TCGTCATCCT GCTGCAACGC 

151 ACCCGCCGCA AATGGCTGGA TGCCCACGAA CGCCAACACC TGCGCCAAAG 

201 CCTGCTTGAA ACACGGGAAC ACGGCTGA 



Number 75 ORF 

1 . . GCCGAAGACA CGCGCGTTAC CGCACAGCTT TTGAGCGCGT ACGGCATTCA 

51 GGGCAAACTC GTCAGTGTGC GCGAACACAA CGAACGGCAG ATGGCGGACA 

101 AGATTGTCGG CTATCTTTCA GACGGCATGG TTGTGGCACA GGTTTCCGAT 

151 GCGGGTACGC CGGCCGTGTG CGACCCGGGC GCGAAACTCG CCCGCCGCGT 

201 GCGTGAGGCC GGGTTTAAAG TCGTTCCCGT CGTGGGCGCA AC.GCGGTGA 

251 TGGCGGCTTT GAGCGTGGCC GGTGTGGAAG GATCCGATTT TTATTTCAAC 

301 GGTTTTGTAC CGCCGAAATC GGGAGAACGC AGGAAACTGT TTGCCTiAATG 

351 GGTGCGGGCG GCGTTTCCTA TCGTCATGTT TGAAACGCCG CACCGCATCG 

401 GTGCAGCGCT TGCCGATATG GCGGAACTGT TCCCCGAACG CCGATTAATG 

451 CTGGCGCGCG AAATTACGAA AACGTTTGAA ACGTTCTTAA GCGGCACGGT 

501 TGGGGAAATT CAGACGGCAT TGTCTGCCGA CGGCGACCAA TCGCGCGGCG 

551 AGATGGTGTT GGTGCTTTAT CCGGCGCAGG ATGAAAAACA CGAAGGCTTG 

601 TCCGAGTCCG CGCAAAACAT CATGAAAATC CTCACAGCCG AGCTGCCGAC 

651 CAAACAGGCG GCGGAGCTTG CTGCCAAAAT CACGGGCGAG GGAAAGAAAG 

701 CTTTGTACGA T. . 



Number 76 ORF 



1 ATGAAAACAA CCGACAAACG GACAACCGAA ACACACCGCA AAGCCCCGAA

51 AACCGGTCGC ATCCGCTTCT C.GCTGCTTA CTTAGCCATA TGCCTGTCGT

101 TCGGCATTCT TCCCCAAGCC TGGGCGGGAC ACACTTATTT CGGCATCAAC

151 TACCAATACT ATCGCGACTT TGCCGAAAAT AAAGGCAAGT TTGCAGTCGG

201 GGCGAAAGAT ATTGAGGTTT ACAACAAAAA AGGGGAGTTG GTCGGCAAAT

251 CAATGACAAA AGCCCCGATG ATTGATTTTT CTGTGGTGTC GCGTAACGGC

301 GTGGCGGcAT TGGTGGGCGt ATCAATATAT TGTGAGCGTG GCACATAACG

351 GCGGCTATAA CAACGTTGAT TTTGGTGCGG PAGGAAk.AA tATCCC.GAT

401 CAACAwCGww TTACTTATAA AATTGTGA?iA CGGAAT.n^TT ATAAAGCAGG

451 GACTAAAGGC CATCCTTATG GCGGCGATTA TCATATGCCG CGTTTGCATA

501 AATwTGTCAC AGATGCAGAA CCTGTTGAAA TGACCAGTTA TATGGATGGG

551 CGGAAATATA TCGATCAAAA TAATTACCCT GACCGTGTTC GTATTGGGGC

601 AGGCAGGCAA TATTGGCGAT CTGATGAAGA TGAGCCCAAT AACCGCGAAA

651 GTTCATATCA TATTGCAAGT

701 GGCTC ACCAATGTTT ATCTATGATG CCCAAAAGCA

751 AAAGTGGTTA ATTAATGGGG TATTGCAAAC GGGCAACCCC TATATAGGAA

801 AAAGCAATGG CTTCCAGCTG GTTCGTAAAG ATTGGTTCTA TGATGAAATC

851 TTTGCTGGAG ATACCCATTC AGTATTCTAC GAACCACGTC AAAATGGGAA

901 ATACTCTTTT AACGACGATA ATAATGGCAC AGGAAAAATC AATGCCAAAC

951 ATGAACACAA TTCTCTGCCT AATAGATTAA AAACACGAAC CGTTCAATTG

1001 TTTAATGTTT CTTTATCCGA GACAGCAAGA GAACCTGTTT ATCATGCTGC

1051 AGGTGGTGTC AACAGTTATC GACCCAGACT GAATAATGGA GAAAATATTT

1101 CCTTTATTGA CGAAGGAAAA GGCGAATTGA TACTTACCAG CAACATCAAT

1151 CAAGGTGCTG GAGGATTATA TTTCCAAGGA GATTTTACGG TCTCGCCTGA

1201 AAATAACGAA ACTTGGCAAG GCGCGGGCGT TCATATCAGT GAAGACAGTA

1251 CCGTTACTTG GAAAGTAAAC GGCGTGGCAA ACGACCGCCT GTCCAAAATC

1301 GGCAAAGGCA CGCTG

// 

2101 GATAAAG 

2151 TGACTGCTTC ATTGACTAAG ACCGACATCA GCGGCAATGT CGATCTTGCC 

2201 GATCACGCTC ATTTAAATCT CACAGGGCTT GCCACACTCA ACGGCAATCT 

2251 TAGTGCAAAT GGCGATACAC GTTATACAGT CAGCCACAAC GCCACCCAAA 

2301 ACGGCAACCk TAgCCtCGtG G . sAATGcCC AAGCAACATT TAATCAAGCC 

2351 ACATTAAACG GCAACACATC GGCTTCgGGC AATGCTTCAT TTAATCTAAG 

2401 CGACCACGCC GTACAAAACG GCAGTCTGAC GCTTTCCGGC AACGCTAAGG 

2451 CAAACGTAAG CCATTCCGCA CTCAACGGTA ATGTCTCCCT AGCCGATAAG

2501 GCAGTATTCC ATTTTGAAAG CAGCCGCTTT ACCGGACAAA TCAGCGGCGG 

2551 CAagGATACG GCATTACACT TAAAAGACAG CGAATGGACG CTGCCGTCAg 

2601 GarCGGAATT AGGCAATTTA AACCTTGACA ACGCCACCAT TACaCTCAAT

2651 TCCGCCTATC GCCACGATGC GGCAGGGGCG CAAACCGGCA GTGCGACAGA 

2701 TGCGCCGCGC CGCCGTTCGC GCCGTTCGCG CCGTTCCCTA TTATmCGTTA 

2751 CACCGCCAAC TTCGGTAGAA TCCCGTTTCA ACACGCTGAC GGTAAACGGC 

2801 AAATTGAACG GTCAGGGAAC ATTCCGCTTT ATGTCGGAAC TCTTCGGCTA 

2851 CCGCAGCGAC AAATTGAAGC TGGCGGAAAG TTCCGAAGGC ACTTACACCT 

2901 TGGCGGTCAA CAATACCGGC AACGAACCTG CAAGCCTCGA ACAATTGACG 

2951 GTAGTGGAAG GAAAAGACAA CAAACCGCTG TCCGAAAACC TTAATTTCAC 

3001 CCTGCAAAAC GAACACGTCG ATGCAGGCGC GTGG 



3551                                  CGCGTATTTG CCGAAGACCG

3601 CCGCAACGCC GTTTGGACAA GCGGCATCCG GGACACCAAA CACTACCGTT


3651 CGCAAGATTT CCGCGCCTAC CGCCAACAAA CCGACCTGCG CCAAATCGGT

3701 ATGCAGAAAA ACCTCGGCAG CGGGCGCGTC GGCATCCTGT TTTCGCACAA

3751 CCGGACCGAA AACACCTTCG ACGACGGCAT CGGCAACTCG GCACGGCTTG

3801 CCCACGGCGC CGTTTTCGGG CAATACGGCA TCGACAGGTT CTACATCGGC

3851 ATCAGdCGCG GGCGCGGGTT TTAGCAGCGG CAGCCTTTcA GACGGCATCG

3901 GAGsmAAAwT CCGCCGCCGC GTGCtGCATT ACGGCATTCA GGCACGAtAC

3951 CGCGCCGgtt tCggCGgATt CGGCATCGAA CCGCACATCG GCGCAACGCg

4001 CtATTTCGTC CAAAAAGCGG ATTACCGCTA CGAAAACGTC AATATCGCCA

4051 CCCCCGGCCT TGCATTCAAC CGcTACCGCG CGGGCATTAa GGCAGATTAT

4101 TCATTCAAAC CGGCGCAACA CATTTCCATC ACGCCTTATT TGAGCCTGTC

4151 CTATACCGAT GCCGCTTCGG GCAAAGTCCG AACACGCGTC AATACCGCCG

4201 TATTGGCTCA GGATTTCGGC AAAACCCGCA GTGCGGAATG GGgCGTAAAC

4251 GCCGAAATCA AAGGTTTCAC GCTGTCCCTC CACGCTGCCG CCGCCAAAGG

4301 CCCGCAACTG GAAGCGCAAC ACAGCGCGGG CATCAAATTA GGCTACCGCT

4351 GGTAA. . .











Number 77 ORF

1 . .AAGGTGTGGC AATTTGTCGA AGA.CCGCTG CGTGCCGTCG TGCCTGCCGA

51 CAGTTTTGAA CCGACCGCGC AAAAATTGAA CCTGTTTAAG GCGGGTGCGG

101 CAACCATTTT GTTTTATGAA GATCAAAATG TCGTCAAAGG TTTGCAGGAG

151 CAGTTCCCTG CTTATGCCGC TAACTTCCCC GTTTGGGCGg ATCAGGCAAA

201 CGCGATGGTG CAGTATGCCG TTTGGACGAC ACTTGCCGCG GTCGGCGTAG

251 GTGCAAACCT GCAACATTAC AATCCCTTGC CCGATGCGGC GATTGCCAAA

301 GCGTGGAATA TCCCCGAAAA CTGGTTGTTG CGCGCACAAA TGGTTATCGG

351 CGGTATTGAA GGGGCGGCAG GTGAAAAGAC CTTTGAACCC GTTGCAGAAC

401 GTTTGAAAGT GTTCGGCGCA TAA







Number 78 ORF 



1 . . GGCTACAACT ACCTGTTCGC GCGCGGCAGC CGCATCGCCA ACTACCAAAT

51 CAACGGCATC CCCGTTGCCG ACGCGCTGGC CGATACGGGt CAATGCCAAC

101 ACCGCCGCCT ATGAGCGCGT AGAAGTCGTG CGCGGCGTGG CGGGGCTGCT

151 GGACGGCACG GGCGAGCCTT CCGCCACCGT CAATCTGGTG CGCAAACGCC

201 TGACCCGCAA GCCATTGTTT GAAGTCCGCG CCGAAGCgGG CAACCGcAAA

251 CATTTCGGGC TGGACGCGGA CGTATCGGGC AGCCTGAACA CCGAAG.crC

301 rCTGCGCgGC CGCCTGGTTT CCAcCTTCGG ACGCGGCGAC TCGTGGCGGC

351 GGCGCGAACG CAGCCG3kAT GCCGAACTCT ACGGCATTTT GGAATACGAC

401 ATCGCACCGC AAACCCGCGT CCACGCArGC ATGGACTACC AGCAGGCGAA

451 AGAflACCGCC GACGCGCCGC TCAGcTACGC CGTGTACGAC AGCCAAGGTT

501 ATGCCACCGC CTTCGGCCCG AAAGACAACC CCGCCACAAA TTGGGCGAAC 

551 AGCCACCACC GTGCGCTCAA CCTGTTCGCC GGCATCGAAC ACCGCTTCAA 

601 CCAAGACTGG AAACTCAAAG CCGAATACGA CTAC. . 



Number 79 ORF 

1 ATGCGCACGG CAGTGGTTTT GCTGTTGATC ATGCCGATGG CGGCTTCGTC

51 GGCAATGATG CCGGAAATGG TGTGCGCGGG CGTGTCGCCG GGAACGGCAA

101 TCATATCCAA GCCGACCGAA CAAACGGCGG TCATGGCTTC GAGTTTGTCC

151 AGCGTCAgcA CGCCTGCTTC GGCGgcGgCa ATCATACCTT CGTCTTCGGA

201 AACGGGGATA AACGcGCCAC TCAAACCCCC GACCGCGCTG GAAGCCATCA

251 TGCCGCCTTT TTTCACGGCA TCGTTCAGCA ATGCCAAAGC TGCTGTTGTG

301 CCGTGCGTAC CGCAGACGCT CAAGCCCATT TnTTCAAGAA TGCGTGCCAC

351 TnAGTCGCCG ACGGGG. .



Number 80 ORF 



. .ACCGACGTGC AAAAAGAGTT GGTCGGCGAA CAACGCAAGT GGGCGCAGGA

AAAAATCAGC AACTGCCGAC AAGCCGCCGC GCAGGCAGAC CGGCAGGAAT 

ACGCCGAATA CCTCAAGCTG CAATGCGACA CGCGGATGAC GCGCGAACGG 

ATACAGTATC TTCGCGGCTA TTCCATCGAT TAG 



Number 81 ORF 

1 ATGCAGCTGA TCGACTATTC ACATTCATTT 7TCTCGGTTG TGCCACCCTT 

51 TTTGGCACTG GCACTTGCCG TCATTACCCG CCGCGTACTG CTGTCTTTAG 

101 GCATCGGTAT TCTGGwysGC GTTGCCTTTT TGGTCGGCGG CAACCCCGTC 

151 GACGGTCTGA CACACCTGAA AGACATGGTC GTCGGCTTGG CTTGGTCAGA 

201 CGsyGATTGG TCGCTGGGCA AACCAAAAAT CTTGGTTTTC CkGATACTTT 

251 TGGGTATTTT TACTTCCCTG CTGACCTACT CCGGCAGCAA T

// 

851 AC TTCGCTGGTA 

901 TTCGGCGGCA CTTGCGGCGT CTTTGCCGTC GTTCTCTGCA CGCTCGGCAC 

951 GATTAAAACC GCCGACTATC CCAAAGCCGT TTGGCAGGGT GCGAAATCTA 

1001 TGTTCGGCGC AATCGCCATT TTAATCCTCG CTTGGCTCAT CAGTACGGTT 

1051 GTCGGCGAAA TGCACACCGG CGATTACCTC TCCACACTGG TTGCGGGCAA 

1101 CATCCATCCC GGCTTCCTGC CCGTCATCCT CTTCCTGCTC GCCAGCGTGA 

1151 TGGCGTTTGC CACAGGCACA AGCTGGGGGA C3TTCGGCAT TATGCTGCCG 

1201 ATTGCCGCCG CCATGGCGGT CAAAGTCGAA CCCGCGCTGA TTATCCCGTG 

1251 TATGTCCGCA GTAATGGCGG GGGCGGTATG CGGCGACCAC TGCTCGCCCA 

1301 TTTCCGACAC GACCATCCTG TCGTCCACCG GCGCGCGCTG CAACCACATC 

1351 GACCACGTTA CCTCGCAACT GCCTTACGCC TTAACCGTTG CCGCCGCCGC 

1401 CGCATCGGGC TACCTCGCAT TGGGTCTGAC AAAATCCGCG CTGTTGGGCT

1451 TTGGCACGAC AGGCATTGTA TTGGCGGTGC 7GATTTTTCT GTTGAAAGAT 

1501 AAAAAA. . 



Number 82 ORF 

1 . .AAGCAATGGT ATGCCGACGN .AGTATCAAG ACGGAAATGG TTATGGTCAA 

51 CGATGAGCCT GCCAAAATTC TGACTTGGGA TGAAAGCGGC CGATTACTCT 

101 CGGAACTGTC TATCCGCCAC CATCAACGCA ACGGGGTGGT TTTGGAGTGG 

151 TATGAAGATG GTTCTAAAAA GAGCGAAGT . GTTTATCAGG ATGACAAGTT 

201 GGTCAGGAAA ACCCAGTGGG ATAAGGATGG TTATTTAATC GAACCCTGA 



Number 83 ORF 



1 ATGAAACAGA CAGTCAA.AT GCTTGCCGCC GCCCTGATTG CCTTGGGCTT 
51 GAACCGACCG GTGTGGNCGG ATGACGTATC GGATTTTCGG GAAAACTTGC 
A.GCGGCAGC ACAGGGAAAT GCAGCAGCCC .-JiTACAATTT GGGCGCAATG
TAT.TACAAA GGACGCGCGT GCGCCGGGAT 3ATGCTGAAG CGGTCAGATG 
GTATCGGCAG CCGGCGGAAC AGGGGTTAGC CCAAGCCCAA TACAATTTGG 
GCTGGATGTA TGCCAACGG3 CGCGC.GTGC GCCAAGATGA TACCGAAGCG 
GTCAGATGGT ATCGGCAGGC GGCAGCGCAG 33GGTTGTCC AAGCCCAATA 
CAATTTGGGC GTGATATATG CCGAAGGACG TGGAGTGCGC CAAGACGATG 
TCGAAGCGGT CAGATGGTTT CGGCAGGCGG CAGCGCAGGG GGTAGCCCAA 
GCCCAAAACA ATTTGGGCGT GATGTATGCC 3AAAGANCGC GCGTGCGCCA 
AGACCG. . . 



Number 84 ORF 

1 ATGAAATTTA CCAAGCACCC CGTCTGGGCA ATGGCGTTCC GCCCATTTTA

51 TTCGCTGGCG GCTCTGTACG GCGCATTGTC C3TATTGCTG TGGGGTTTCG

101 GCTACACGGG AACGCACkAG CTGTCCGGTT rCTATTGGCA CGCGCATGAg

151 ATGATTTGGG GTTATGCCGG ACTGGTCGTC ATCGCCTTCC TGCTGACCGC

201 CGTCGCCACT TGGACGGGGC AGCCGCCCAC GCGGGGCGGC GTaTCTGGTC

251 GGCTTGACTA TCTTTTGGCT GGCTGCGCGG ATTGCCGCCT TTATCCCGGG

301 TTGGGGTGCG TCGGCAAGCG GCATACTCGG ?ACGCTGTTT TTCTGGTACG

351 GCGCGGTGTG CATGGCTTTG CCCGTTATCC GrTCGCAGAA TCAACGCAAC

401 TATGTTgCCG TGTTCGCGCT GTTCGTCTTG 33CGGCACGC ATGCGGCGTT

451 CCACGTCCAG CTGCACAACG GCAACCTAGG C3GACTCTTG AGCGGATTGC

501 AGTCGGGCTT GGTGATG



Number 85 ORF 

1 . .ATGCCGTCTG AAGGTTCAGA CGGCmTCGGT GyCGGGGAAy CAGAAGyGGT 

51 AGCGCATGCC CAZ^TGAGACT TCGTGGGTTr TGAAGCGGGT GTTTTCCAAG 

101 CGTCCCCAGT TGTGGTAACG GTATCCGGTG TCyAArGTCA GCTTGGGyGT 

151 GATGTCGAAa CCGACACCGG CGATGACACC AAGACCyAniG CTGCTGATrC 

201 TGTkGCTTTC GTGATAGGsA GGTTTGyTGG fcmksAsyTTG TAyrATwkkG 

251 CCTssCwaTG kAGmGCCkTk CkyTGGTkk.=i swGrwArTAG TCGTGGTTTy 

301 TkTTyyCACC GAATGAACyT GATGTTTAAC GTGTCCGTAG GCGACGCGCG 

351 CGCCGATATA GGGTTTGAAT TTATCGTTGA GTTTGAAATC GTAAATGGCG 

401 GACAAGCCGA GAGAAGAAAC GGCGTGGAAG CTGCCGTTTC CCTGATGTTT 

451 TGTTTGGGTT TCTTTGTAGT TGTTGTTTAT CTCTTCAGTA ACTTTTTTAG

501 TAGAAGAATT ACTTTCTTTC CATTTTCTGT AACTGGCATA ATCTGCCGCT 

551 ATTCTCCAGC CGCCGAAATC . . 



Number 86 ORF 



ATGTTTGCTT TTTTAGAAGC CTTTTTTGTC 3AATACGGTT ATGCGGCTGT 

TTTTTTTGTA TTGGTCATCT GCGGTTTCGG CGTGCCGATT CCCGAGGATT 

TGACCTTGGT AACAGGCGGC GTGATTTCGG GTATGGGTTA TACCAATCCG 

CATATTATGT TTGCAGTCGG TATGCTCGGC GTATTGGTCG GGGACGGCAT 

CATGTTCGCC GCCGGACGAA TTTGGGGGCA GArArTCCTA rGGTTCArAC 

CTATTGCGsG CATCATGACG CCGrAACGTT ATGAGCAGGT TCAGGAAAAA 

TTCGACAAAT ACGGTAACTG GGTCTTATTT GTCGCCCGTT TCCTGCCCGG 

TTTGAGAACG GCCGTATTTG TTACAGCCGG 7ATCAGCCGC AAGGTTTCAT 
ACTTGCGTTT TATCATTATG GATGGACTGG CCGCA. . . 



Number 87 ORF 

1 ATGAAAAAAT TATTGGCGGC CGTGATGATG GCAGGTTTGG CAGGCGCGGT 

51 TTCCGCCGCC GGAGTCCACG TTGAGGACGG CTGGGCGCGC ACCACCGTCG 

101 AAGGTATGAA AATAGGCGGC GCGTTCATGA AAATCCACAA CGACGAAGCC 

151 AAACAAGACT TTTTGCTCGG CGGAAGCAGC CCCGTTGCCG ACCGCGTCGA 

201 AGTGCATACC CACATCAACG ACAACGGCGT GATGCGGATG CGCGAAGTCG 

251 AAGGCGGCGT GCCTTTGGAA GCGAAATCCG TTACCGAACT CAAACCCGGC 

301 AGCTATCATG TGATGTTTAT GGGTTTGAAA AAACAATTAA AAGAGGGCGA 

351 TAAAATTCCC GTTACCCTGA AATTTAAAAA CGCCAAAGCG CAAACCGTCC 
401 AACTGGAAGT CAAAATCGCG CCGATGCCGG CAATGAACCA C. . .



Number 88 ORF 

1 ATGACGGTAA CTGCGGCCGA AGGCGGCAAA GCTGCCAAGG CGTTAAAAAA 

51 ATATCTGATT ACGGGCATTT TGGTCTGGCT GCCGATTGCG GTAACGGTTT 

101 GGGTGGTTTC CTATATCGTT TCCGCGTCCG ATCAGCTCGT CAACCTGCTG 

151 CCGAAGCAAT GGCGGCCGCA ATATGTTTTG GGGTTTAATA TCCCGGGGCT 

201 GGGCGTTATC GTTGCCATTG CCGTATTGTT TGTAACCGGA TTGTTTGCCG 

251 CCAACGTATT GGGTCGGCAG ATCCTCGCCG CGTGGGACAG CCTGTTGGGG 

301 CGGATTCCGG TTGTGAAAtC CATCTATTCG AGTGTGAAAA AAGTATCCGA 

351 ATacgTGCTG TCCGACAGCA GCCGTTCGTT TAAAACGCCG GTACTCGTGC 

401 CGTTTCCCCA GCCCGGTATT TGGACGATyG CTTTCGTGTC AGGGCAGGTG 

451 TCGAATGCGG TTAAGGCCGC ATTGCCGAAs GACGGCGATT ATCTTTCCGT 

501 GTATGTTCCG ACCACGCCGA ATCCGACCGG CGGTTACTAT ATTATGGTAA 

551 AGAAAAGCGA TGTGCGCGAA CTCGATATGA GCGTGGACGA AsCATTGAAA 

601 TATGTGATTT CGCTGGGTAT GGTCATCCCT GACGACCTGC CCGTCAAAAC 

651 ATTGGCAsGA CCTATGCCGT CTGAAAAGGC GGATTTGCCC GAACAACAAT 

701 AA 



Number 89 ORF 



1 ATgAAAACGG TAGTCTGGAT TGTCGTCCTG TTTGCCGCCG CCGTCGGACT

51 GGCGCTGGCT TCGGGCATTT ACACCGGCGA CGTGTATATC GTACTCGGAC

101 AGACCATGCT CAGAATCAAC CTGCACGCCT TTGTGTTAGG TTCGCTGATT

151 GCCGTCGTGG TGTGGTATTT CTTGTTTAAA TTCATTATCG GjGgTACTCA

201 ATATCCCCGA AAAGATGCAG CGTTTCGGTT CGGCnCGTAA AGGCCkCAAG

251 ssCGsGCTTG CCTTGAACAA GGCGGGTTTG GCGTATTTTG AAGGGCGTTT

301 TGAAAAGGCG GAACTAGAAG CCTCACGCGT GTTGGTCAAC AAAGtAGGCC

351 GaGAGACAAC CGGACTTTGG CATTGATGCT GrGCGCGCAC GCCGCCGGAC

401 AGATGGAAAA CATCGASSTG CGCGACCGTT ATCTTGCGGA AATCGCCAAA

451 CTGCCGGAAA AACAGCAGCT TTCCCGTTAT CTTTTGTTGG CGGAATCGGC

501 GTTGAACCGG CGCGATTACG AAGCGGCGGA AGCCAATCTT CATGCGGCGG

551 CGAAGATGAA TGCCAACCTT ACGCGCCTCG TGCGTCTGCA .ATTCGTTAC

601 GCTTTCGACA GGGGCGACGC GTTGCAGGTT CTGGCAAAAA CCGAAAAACT

651 TTCCAAGGCG GGCGCGTTGG GCAAATCGGA AATGGAACGG TATCAAAATT

701 GGGCATATCC GTCGCCAGCT GGCGGATGCT GCCGATGCCG CCGCTTTGAA

751 AACCTGCCTG AAGCGGATTC CCGACAGCCT CAAAAACGGG GAATTGAGCG

801 TATCGGTTGC GGAAAAGTAC GAACGTTTGG GACTGTATGC CGATGCGGTC

851 AAATGGGTCA AACAGCATTA TCCGCAsAAC CGCCGCCCCG AGCTTTTGGA

901 AGCCTTTGTC GAAAGCGTGC GCTTTTTGGG CGAGCGCGAA CAGCAGAAAG

951 CCATCGATTT TGCCGATGCT TGGCTGAAAG AACAGCCCGA TAACGCGCTT

1001 CTGCTGATGT ATCTCGGTCG GCTCGCCTTC G3CCGCAAAC TTTGGGGCAA

1051 GGCAAAAGGC TACCTTGAAG CGAGCATTGC ATTAAAGCCG AGTATTTCCG

1101 CGCGTTTGGT TCTAACAAAG GTTTTCGACG AAATCGGAGA ACCGCAGAAG

1151 GCGGAGGCGC AC. . .



Number 90 ORF 

1 ATGATGTTTT CTTGGTTCAA GCTGTTTCAC TTGTTTTTTG TCATTTCGTG 

51 GTTTGCAGGG CTGTTTTACC TGCCGAGGAT TTTCGTCAAT ATGGCGATGA 

101 TTGATGTGCC GCGCGGCAAT CCCGAGTATG TGCGTCTGTC GGGCATGGCG 

151 GTGCGGCTGT ACCGTTTTAT GTCGCCGTTG GGCTTCGGCG CGGTCGTGTT 

201 CGGCGCGGCG ATACCGTTTG CCGCCGGCTG GTGGGGCAGC GGCTGGGTAC 

251 ACGTCAAACT GTGTTTGGGC TTGATGCTCT TGGCTTACCA GTTGTATTGC 

301 GGCGTGCTGC TGCGCCGTTT TCAGGATTAC AGCAATGCTT TTTCACACCG 

351 CTGGTACCGC GTGTTCAACG AAATCCCCGT GCTGCTGATG GTTGCCGCGC 

401 TGTATsTGGT CGTGTTCAAA CCGTTTTGA 
Number 91 ORF

1 ATGGCAAAAA TGATGAAATG GGCGGCTGTT GCGGCGGTCG CGGCGGCAGC 

51 GGTTTGGGGC GGATGGTCTT AACTGAAGCC CGAGCCGCAC GTGCTTGATA 

101 TTACGGAAAC GGTCAGGCGC GGC // 

//.. ATTTCGTTTA CGATTTTGTC CGAACCGGAT ACGCCGATTA AGGCGAAGCT 

51 CGACAGCGTC GACCCCGGGC TGACCACGAT GTCGTCGGGC GGTTACAACA 

101 GCAGTAC3GA TACGGCTTCC AATGCGGTCT ACTATTATGC CCGTTCGTTT 

151 GTGCCGAATC CGGACGGCAA ACTCGCCACG GGGATGACGA CGCAGAATAC 

201 GGTTGAAATC GACGGCGTGA AAAATGTGCT GATTATTCCG TCGCTGACCG 

251 TGAAAAATCG CGGCGGCAAG GCGTTTGTGC GCGTGTTGGG TGCGGACGGC 

301 AAGGCGGCGG AACGCGAAAT CCGGACCGGT ATGAGAGACA GTATGAATAC 

351 CGAAGTAAAA AGCGGGTTGA AAGAGGGGGA CAAAGTGGTC ATCTCCGAAA 

401 TAACCGCCGC CGAGCAACAG GAAAGCGGCG AACGCGCCCT AGGCGGCCCG 

451 CCGCGCCGAT AA 



Number 92 ORF 

1 . .ATTCCCGCCA CGATGACATT TGAACGCAGC GGCAATGCTT ACAAAATCGT 

51 TTCGACGATT AAAGTGCCGC TATACAATAT CCGTTTCGAG TCCGGCGGTA 

101 CGGTTGTCGG CAATACCCTG CACCCTACCT ACTATAGAGA CATACGCAGG 

151 GGCAAACTGT ATGCGGAAcc CAAATTCGCC GACgGcAGCG TAACTTACGG 

201 CAAAGCGGGC GAGAGCAAPA CCGAGCAAAG CCCCAAGGCT ATGGATTTGT 

251 TCACGCTTGC CTGGCAGTTG GCGGCAAATG ACGCGAAACT CCCCCCGGGG 

301 CTGAAAATCA CCAACGGCPA. .z\AAACTTTAT TCCGTCGGCG GTTTGAATAA 

351 GGCGGGTACA GGAAAATACA GCATAGGCGG CGTGGAAACC GAAGTCGTCA 

401 AATATCGGGT GCGGCGCGGC GACGATGCGG TAATGTATTT cTTCGCACCG 

451 TCCCTGAACA ATATTCCGGC ACAAATCGGC TATACCGACG ACGGCAAAAC 

501 CTATACGCTG AAACTCAAAT CGGTGCAGAT CAACGGCCAG GCAGCCAAAC 

551 CGTAA 



Number 93 ORF 

1 ATGTATCGGA GGAAAGGGCG GGGCATCAAG CCGTGGATGG GTGCCGGTGC

51 .GCGTTTGCC GCCTTGGTCT GGCTGGTTTT CGCGCTCGGC GATACTTTGA

101 CTCCGTTTGC GGTTGCGGCG GTGCTGGCGT ATGTATTGGA CCCTTTGGTC

151 GAATGGTTGC AGAAAAAGGG TTTGAACCGT GCATCCGCTT CGATGTCTGT

201 GATGGTGTTT TCCTTGATTT TGTTGTTGGC ATTATTGTTG ATTATCGTCC

251 CTATGC7GGT CGGGCAGTTC AACAATTTGG CATCGCGCCT GCCCCAATTA

301 ATCGGTTTTA TGCAGAACAC GCTGCTGCCG TGGTTGAAAA ATACAATCGG

351 CGGATATGTG GAAATCGATC AGGCATCTAT TATTGCGTGG CTTCAGGCGC

401 ATACGGGAGA GTTGAGCAAC GCGCTTAAGG CGTGGTTTCC CGTTTTGATG

451 AGGCAGGGCG GCAATATT. .



Number 94 ORF 

1 . .ACTGCTTTTT CGGCGGCGCT GCGCTTGAGT CCATCATGAC TCGTCATATT

51 TTTGTCCTTT GGGAAACCGT ATCAACAAAC AGCCGCCATC TTAACATTTT

101 TTTGCACGTC CTGCCCGCCG CGTTCAAATG CGTACCAGCA ATACCGCCGC

151 CTGCGCCTCT ATGCCTTCCA TCCGCCCGAG ATAGCCGAGT TTTTCGTTGG

201 TTTTGCCTTT GATGTTGACG CACGAAATGT CTATGCCCAA ATCGGCGGCG

251 ATGTTGGCAC GCATTTGCGG AATGTGCGGC GCGAGTGTGG GTTTCTGTGC

301 AATCACGGTC GTATCGACAT TGACCGCCTG CCAACCCTGC GCCTGAACGC

351 TTTGATACGC CGCACGCAAA AGGACGCGGC TGTCCGCATC TTTGAACTCT

401 GCGGCGGTGT CGGGGAAATG GCTGCCGATA TCGCCCAAAC CTGCCGCACC

451 GAGCAGCGCG TCGGTAACGG CGTGCAGCAG CGCATCGGCA TCGGAGTGTC

501 CGAGCAGCCC TTTTTCAAAT GGGATTTCAA CTCCGCCAAG TATCAG. .



Number 95 ORF 

1 . . GCCGGCGCGA GTGCGAACAA CATTTCCGCG CGTTTTGCGG AAACACCCGT 
51 CGCTGTCAGC GTTACCCTGA TCGGCACGGT ACTTGCCGTC ATGCTGCCCG
101 TTACCGAATA TGAAAACTTC CTGCTGCTTA TCGGCTCGGT ATTTGCGCCG 
151 ATGgGGCGGA JTTTGATTGC CGACTTTTTC GTCTTGAAAC GGCGTGA 



Number 96 ORF 

1 ATGACCCGTA TCGCCATCCT CGGCGGCGGC CTCTCGGGAA GGCTGACCGC 

51 GTTGCAGCTT GCAGAACAAG GTTATCAGAT TGCACTTTTC GATAAAAGCT 

101 GCCGCCGGGG CGAACACGCC GCCGCCTATG TAGCCGCCGC CATGCTCGCG 

151 CCTGCAGCGG A.ACGGTCGA AGCCACGCCC GAAGTGGTCA GGCTGGGCAG 

201 GCAGAGCATC CCGCTTTGGC GCGGCATCCG ATGCCGTCTG AACACGCACA 

251 CGATGATGCA GGAAAACGGC AGCCTGATTG TATGGCACGG GCAGGACAAG 

301 CCATTATCCA GCGAGTTCGT CCGCCATCTC AAACGCGGCG GCGT.ACGGA 

351 TGACGAAATC GTCCGTTGGC GCGCCGACGA CATCGCCGAA CGCGAACCGC 

401 AACTCGGCGG ACGTTTTTAA GACGGCATCT ACCTGCCGAC CGAAGC.CAG 

451 CTCGACGGGC GGCAATTATA GTCTGCACTT GCCGACGCTT TGGACGAACT 

501 GAACGTCCCC TGCCATTGGG AACACGAATG CGTCCCCGAA GCCTGCAAG. . 



Number 97 ORF 

1 ATGACTGATA ATCGGGGGTT TACGCTGGTT GAATTAATAT CAGTGGTCTT 

51 GATATTGTCT GTACTTGCTT TAATTGTTTA TCCGAGCTAT CGCAATTATG 

101 TTGAGAAAGC AAAGATAAAT GCAGTGCGGG CAGCCTTGTT AGAAAATGCA 

151 CATTTTATGG AAAAGTTTTA TCTGCAGAAT GGGAGGTTTA AACAAACATC 

201 TACCAAGTGG CCAAGTTTGC CGATTAAAGA GGCAGAAGGC TTTTGTATCC 

251 GTTTGAATGG AATCGtCGCG CGGG. .GCTT TAGACAGTAA ATTCATGTTG 

301 AAGGCGGTAG CCATAGATAA AGATAAAAAT CCTTTTATTA TTAAGATGAA 

351 TGAAAATCTA GTAACCTTTA aTTTGCAAGA AGTCCGCCAG TTCGTGTAGT 

401 GACGGGCTGG ATTATTTTAA AGGAAATGAT AAGGACTGCA AGTTACTTAA 

451 GTAG 



Number 98 ORF 

1 . . GTGTCGCTGG CTTCGGTGAT TGCCTCTCAA ATCTTCCTTT ACGAAGATTT 

51 CAACCAAATG CGGAAAACC£ GTGGAGCTAT CTGCGGTTTT CTTGTCCAAT 

101 ATTTATCTGG GGTTTCAGCA GGGGTATTTC GATTTGAGTG CCGACGAGAA 

151 CCCCGTACTG CATATCTGGT CTTTGGCAGT AGAGGAACAG TATTACCTCC 

201 TGTATCCCCT TTTGCTGATA TTTTGCTGCA AAAAAACCAA ATCGCTACGG 

251 GTGCTGCGTA ACATCAGCAT CATCCTGTTT TTGATTTTGA CTGCCTCATC 

301 GTTTTTGCCA AGCGGGTTTT ATACCGACAT CCTCAACCAA CCCAATACTT 

351 ATTACCTTTC GACACTGAGG TTTCCCGAGC TGTTGGCAGG TTCGCTGCTG 

401 GCGGTTTACG GGCAAACGCA AAACGGCAGA CGGCAAACAG CAAATGGAAA 

451 ACGGCAGTTG CTTTCATCAC TCTGCTTCGG CGCATTGCTT GCCTGCCTGT 

501 TCGTGATTGA CAAACACAAT CCGTTTATCC CGGGAATGAC CCTGCTCCTT 

551 CCCTGCCTGC TGACGGCACT GCTTATCCGG AGTATGCAAT ACGGGACACT 

601 TCCGACCCGC ATCCTGTCGG CAAGCCCCAT CGTATTTGTC GGCAAAATCT 

651 CTTATTCCCT ATACCTGTAC CATTGGATTT TTATTGCTTT CGCTCCGCTC 

701 ATTAGAGGCG GGAAACAGCT CGGACTGCCT GCCG. . 



Number 99 ORF 



. .ATTATTTACG AATACCGCTG GATGTTTCTT TACGGCGCAC TGACGACCTT 
GGGGCTGACG GTCGTGGCAA C.GCGGGCGG TTCGGTATTG GGTCTGTTGT 
TGGCGTTGGC GCGCCTGATT CACTTGGAAA AAGCCGGTGC GCCGATGCGC 
GTGCTGGCGT GGGCGTTGCG TAAAGTTTCG CTGCTGTATG TTACGCTGTT 
CCGGGGTACG CCGCTGTTTG TGCAGATTGT GATTTGGGCG TATGTGTGGT 
TTCCGTTTTT CGTC . . 
Number 100 ORF



. . CTGAARGAAT GCCGTCTGAA AGACCCTGTT TTTATTCCAA ATATCGTTTA 
TAAGAACATC GCCATTACTT TCCTGCTCTT GCACGCCGCC GCCGAACTTr 
GGCTGCCCGC GCAAACCGCC GGTTTTACCG CGCTCGCCGT CGGCTTCATC 
CTGCTCGCCA AGCTGCGTGA gCTTCACCAT CACGAACTCT TACGTAAACA 
cTACGTCCGC ACTTATTACy TGCTCCAACT CTTTGCCGCC GCAGgcTAgT 
TTGTGGACAG GCGCGGCGwA ATTACAAAAC CTGCCCGCyT CCGCGCCCC7 
GCACCTGATT ACCCTCGGCG GCATGATGGG CGGCGTGATG ATGGTGTGGc 
TGACCGCCGG ACTGTGGCAC AGCGGCTTTA CCAAACTCGA CTACCCCAAA 
CTCTGCCGCA TTGCCGTCCC CATCCTTTTC GCCGCCGCCG TCTCGCGCGC 
TTTCTTGrTG AACGTGAACC CGrTATTTTT CATTACCGTT CCTGCGATTC 
TGACCGCCGG CGTATTCGTA CTGTATCTTT TCrCGTTTAT ACCGATATTT 
CGGGCGAATG CGTTTACAGA CGATCCGGAr TAr 



Number 101 ORF 

1 ATGGAAATTC GGGCAATAAA ATATACGGCA ATGGCTGCGT TGCTTGCATT 

51 TACGGTTGCA GGCTGCCGGC TGGCGGGGTG GTATGAGTGT TCGTCCCTCA 

101 CCGGCTGGTG TAAGCCGAGA AAACCGGCTG CCATCGATTT TTGGGATATT 

151 GGCGGCGAGA GTCCGCCGTC TTTAGGGGAC TACGAGATAC CGCTTTCAGA 

201 CGGCAATAGT TCCGTCAGGG CAAACGAATA TGAATCCGCA CAACAATCTT 

251 ACTTTTACAG GAAAATAGGG AAGTTTGAAG C.TGCGGGCT GGATTGGCGT 

301 ACGCGTGACG GCAAACCTTT GATTGAGACG TTCAAACAGG GAGGATTTGA 

351 CTGCTTGGAA AAG . . 



Number 102 ORF 

1 ATGAAACACA TCCATATTAT CGGTATCGGC GGCACGTTTA TGGGCGGGCT 

51 TGCCGCCATT GCCAAAGAAG CGGGGTTTGA AGTCAGCGGT TGCGACGCGA 

101 AGATGTATCC GCCGATGAGC ACCCAGCTCG AAGCCTTGGG TATAGACGTG 

151 TATGAAGGCT TCGATGCCGC TCAGTTGGAC GAATTTAAAG CCGACGTTTA 

201 CGTTATCGGC AATGTCGCCA AGCGCGGGAT GGATGTGGTT GAAGCGATTT 

251 TGAACCTCGG CCTGCCtTAT ATtTcCGGCC CGCAATGGCT GTCGGAAAAC 

301 GTGCTGCACC ATCATTGGGT ACTCGGTGTG GCGGGGACgC ACGGCAAAAC 

351 GACCACCGCC TCCATGCTCG CATGGGTCTT GGAATATgCC GGCCTCGCGC 

401 CGGGCTTCCT TATtGGCGGC GTACC.GGAA AATttCGGCG TTTCCGCCCG

451 CCTGCCGCAA ACGCCGCGCC AAGACCCGAA CAGCCAATCG CCGTTTTTcG

501 TCATCGAAGC CGACGAATAC GACACCGCCT TTtTCGACA-i^ ACGTTCTAAA 

551 TtCGTGCATT ACCGTCCGCG TACCGCCGTG TTGAACAATC TGGAATTCGA 

601 CCACGCCGAC ATCTTTGCCG ACTTGGGCGC GATACAGACc CAGTTCCACT 

651 ACCTCGTGCG TACCGTGCCG TCTGAAGGCT TAATCGTCTG CAACGGACGG 

701 CAGCAAAGCC TGCAAGATAC TTTGGACAAA GGCTGCTGGA CGCCGGTGGA

751 AAAATTCGGC ACGGAACACG GCTGGCA. . 



Number 103 ORF 

1 . . CCGGGCTATT ACGGCTCGGA TGACGAATTT AAGCGGGCAT TCGGAGAAAA 

51 CTCGCCGACA TmCAAGAAAC ATTGCAACCG GAGCTGCGGG ATTTATGAAC 

101 CCGTATTGAA AAAATACGGC AAAAAGCGCG CCAACAACCA TTCGGTCAGC 

151 ATTAGTGCGG ACTTCGGCGA TTATTTCATG CCGTTCGCCA GCTATTCGCG 

201 CACACACCGT ATGCCCAACA TCCAAGAAAT GTATTTTTCC CAAATCGGCG 

251 ACTCCGGCGT TCACACCGCC TTAAAACCAG AGCGCGCAAA CACTTGGCAA

301 TTTGGCTTCr ATACCTATAA AAAAGGATTG TTAAAACAAG ATGATACATT 

351 AGGATTAAAA CTGGTCGGCT ACCGCAGCCG CATCGACAAC TACATCCACA 

401 ACGTTTACGG GAAATGGTGG GATTTGAACG GGGATATTCC GAGCTGGGTC 

451 AGCAGCACCG GGCTTGCCTA CACCATCCAA CATCGCrATT TCAwAGACAA 

501 AGTGCATCAA nnnnnnnnnn nnnnr.nnnnn nnnnTACGAT TATGGGCGTT 

551 TTTTCACCAA CCTTTCTTAC GCCTATCAAA AAAGCACGCA ACCGACCAAC 

601 TTCAGCGATG CGAGCGAATC GCCCAACAAT GCGTCCAAAG AAGACCAACT 

651 CAAACAAGG7 TATGGGTTGA GCAGGGTTTC CGCCCTGCCG CGAGATTACG 

701 GACGTTTGGA AGTCGGTACG CGCTGGTTGG GCAACAAACT GACTTTGGGC 
751 GGCGCGATGC GCTATTTCGG CAAGAGCATC CGCGCGACGG CTGAAGAACG

801 CTATATCGAC GGCACCAACG GGGGAAATAC CAGCAATTTC CGGCAACTGG 

851 GCAAGCGTTC CATCAAACflA ACCGAAACTC TTGCCCGCCA GCCTTTGATT 

901 TTwGATTTTa ACGCCGCTTA CGAGCCGAAG AAAAACCTTA TTTTCCGCGC 

951 CGAAGTCAAA AATCTGTTCG ACAGGCGTTA TATCGATCCG CTCGATGCGG 

1001 GCAATGATGC GGCAAC.GAG CGTTATTACA GCTCGTTCGA CCCGAAAGAC 

1051 AAGGACrrAG ACGTAACGTG TAATGCTGAT AAAACGTTGT GCaACGGCAA 

1101 ATACGGCGGC ACAAGCAAAA GCGTATTGAC CAATTTTGCA CGCGGACGCA 

1151 CCTTTTTgAT GACGATGAGC TACAAGTTTT AA 



Number 104 ORF 

1 ATGAACCTGA TTTCRCGTTA CATCATCCGT CAAATGGCGG TTATGGCGGT 

51 TTACGCGCTC CTTGCCTTCC TCGCTTTGTA CAGCTTTTTT GAAATCCTGT 

101 ACGAAACCGG CAACCTCGGC AAAGGCAGTT ACGGCATATG GGAAATGCTG 

151 GGCTACACCG CCCTCAAAAT GCCCGCCCGC GCCTACGAAC TGATTCCCCT 

201 CGCCGTCCTT ATCGGCGGAC TGGTCTCCCT CAGCCAGCTT GCCGCCGGCA 

251 GCGAACTGAC CGTCATCAAA GCCAGCGGCA TGAGCACCAA AAAGCTGCTG 

301 TTGATTCTGT CGCAGTTCGG TTTTATTTTT GCTATTGCCA CCGTCGCGCT 

351 CGGCGAATGG GTTGCGCCCA CACTGAGCCA AAAAGCCGAA AACATCAAAG 

401 CCGCCGCCAT CAACGGCAAA ATCAGCACCG GCAATACCGG CCTTTGGCTG

451 AAAGTiAAAAA ACAGCGTGAT CAATGTGCGC GAAATGTTGC CCGACCAT..



Number 105 ORF 

1 ATGAAACTTC TGACCACCGC AATCCTGTCT TCCGCAATCG CGCTCAGCAG 

51 TATGGCTGCC GCCGCTGGCA CGGACAACCC CACTGTTGCA AAAAAAACCG 

101 TCAGCTACGT CTGCCAGCAA GGTAAAAAAG TCAAAGTAAC CTACGGCTTC 

151 AACTiAACAGG GTCTGACCAC ATACGCTTCC GCCGTCATCA ACGGCAAACG 

201 CGTGCAAATG CCTGTCAATT TGGACAAATC CGACAATGTG GAAACATTCT 

251 ACGGCAAAGA AGGCGGTTAT GTTTTGGGTA CCGGCGTGAT GGATGGCAAA 

301 TCCTACCGCA AACAGCCCAT TATGATTACC GCACCTGACA ACCAAATCGT 

351 CTTCAAAGAC TGTTCCCCAC GTTAA 



Number 106 ORF 

1 ..ACACTGTTGT TTGCAACGGT TCAGGCAAGT GCTAACCAAT GAAGAGCAAG 

51 AAGAAGATTT ATATTTAGAC CCCGTACAAC GCACTGTTGC CGTGTTGATA 

101 GTCAATTCCG ATAAAGAAGG CACGGGAGAA AAAGAAAAAG TAGAAGAAAA 

151 TTCAGATTGG GCAGTATATT TCAACGAGAA AGGAGTACTA ACAGCCAGAG 

201 AAATCACCyT CAAAGCCGGC GACAACCTGA AAATCTiAACA AAACGGCACA 

251 AACTTCACCT ACTCGCTGAA AAAAGACCTC AcAGATCTGA CCAGTGTTGG 

301 AACTGAAAAA TTATCGTTTA GCGCAAACGG CAATAAAGTC AACATcACAA 

351 GCGACACCAA AGGCTTGAAT TTTGCGAAAG AAACGGCTGG sACGAACGgC 

401 GACACCACGG TTCATCTGAA CGGTATTGGT TCGACTTTGA CCGATACGCT 

451 GCTGAATACC GGAGCGACCA CAAACGTAAC CAACGACAAC GTTACCGATG

501 ACGAGAAAAA ACGTGCGGCA AGCGTTAAAG ACGTATTAAA CGCTGGCTGG 

551 AACATTAAAG GCGTTAAACC CGGTACAACA GCTTCCGATA ACGTTGATTT 

601 CGTCCGCACT TACGACACAG TCGAGTTCTT GAGCGCAGAT ACGAAAACAA 

651 CGACTGTTAA TGTGGAAAGC AAAGACAACG GCAAGAAAAC CGAAGTTAAA 

701 ATCGGTGCGA AGACTTCTGT TATTAAAGAA AAAGAC. . . 



Number 107 ORF 

1 ..GGCACCGAAT TCAAAACCAC CCTTTCCGGA GCCGACATAC AGGCAGGGGT 

51 GGGTGAAAAA GCCCGAGCC3 AT3CGAAAAT TATCCTAAAA GGCATCGTTA 

101 ACCGCATCCA AACCGAAGAA AA3CTGGAAT CCAACTCGAC CGTATGGCAA 

151 AAGCAGGCCG GAAGCGGCAG CACGGTTGAA ACGCTGAAGC TACCGAGCTT 

201 TGAAGGGCCG GCACTGCCTA AGCTGACCGC TCCCGGCGGC TATATCGCCG 

251 ACATCCCCAA AGGCAACCTC AAAACCGAAA TCGAAAAGCT GGCCAAACAG 

301 CCCGAATATG CCTATCTGAA ACAGCTTCAG ACGGTCAAGG ACGTGAACTG 
351 GAACCAAGTA CAGCTCGCTT ACGACAAATG GGACTATAAA CAGGAAGGCC

401 TAACCGGAGC CGGAGCCGCA ATTANCGCAC TGGCCGTTAC CGTGGTCACC

451 TCAGGCGCAG GAACCGGAGC CGTATTGGGA TTAANACGNG TGGCCGCCGC 

501 CGCAACCGAT GCAGCATTT. . . 



Number 108 ORF 

1 . . CGGATCGTTG TAGGTTTGCG GATTTCTTGC GCCGTAGTCA CCGTAGTCCC 

51 AAGTATAACC CAAGGCTTTG TCTTCGCCTT TCATTCCGAT AAGGGATATG 

101 ACGCTTTGGT CGGTATAQCC GTCTTGGGAA CCTTTGTCCA CCCAACGCAT 

151 ATCTGCCTGC GGATTCTCAT TGCCGCTTCT TGGCTGCTGA TTTTTCTGCC 

201 TTCGCGTTTT TCAACTTCGC GCTTGAGGGC TTCGGCATAT TTGTCGGCCA 

251 ACGCCATTTC TTTCGGATGC AGCTGCCTAT TGTTCCAATC TACATTCGCA 

301 CCCACCACAG CACCACCACT ACCACCAGTT GCATAG 



Number 109 ORF 

1 . .AAGTTTGACT TTACCTGGTT TATTCCGGCG GTAATCAAAT ACCGCCGGTT 

51 GTTTTTTGAA GTATTGGTGG TGTCGGTGGT GTTGCAGCTG TTTGCGCTGA 

101 TTACGCCTCT GTTTTTCCAA GTGGTGATGG ACAAGGTGCT GGTACATCGG 

151 GGATTCTCTA CTTTGGATGT GGTGTCGGTG GCTTTGTTGG TGGTGTCGCT 

201 GTTTGAGATT GTGTTGGGCG GTTTGCGGAC GTATCTGTTT GCACATACGA 

251 CTTCACGTAT TGATGTGGAA TTGGGCGCGC GTTTGTTCCG GCATCTGCTT 

301 TCCCTGCCTT TATCCTATTT CGAGCACAGA CGAGTGGGTG ATACGGTGGC 

351 TCGGGTGCGG GAATTGGAGC AGATTCGCAA TTTCTTGACC GGTCAGGCGC 

401 TGACTTCGGT GTTGGATTTG GCGTTTTCGT TTATCTTTCT GGCGGTGATG 

451 TGGTATTACA GCTCCACTCT GACTTGGGTG GTATTGGCTT CGTTG 

// 

1451 

1501 ATTTGCGC 

1551 CAACCGGACG GTGCTGATTA TCGCCCACCG TCTGTCCACT GTTAAAACGG 

1601 CACACCGGAT CATTGCCATG GATAAAGGCA GGATTGTGGA AGCGGGAACA 

1651 CAGCAGGAAT TGCTGGCGAA CG..AACGGA TATTACCGCT ATCTGTATGA 

1701 TTTACAGAAC GGGTAG 



Number 110 ORF 

1 ATGAAATACT TGATCCGCAC CGCCTTACTC GCAGTCGCAG CCGCCGGCAT 

51 CTACGCCTGC CAACCGCAAT CCGAAGCCGC AGTGCAAGTC AAGGCTGAAA 

101 ACAGCCTGAC CGCTATGCGC TTAGCCGTCG CCGACAAACA GGCAGAGATT 

151 GACGGGTTGA ACGCCCAAAk sGACGCCGAA ATCAGA. . . 



Number 111 ORF 

1 ATGGTTATCG GAATATTACT CGCATCAAGC AAGCATGCTC TTGTCATTAC 

51 TCTATTGTTA AATCCCGTCT TCCATGCATC CAGTTGCGTA TCGCGTTsGG 

101 CAATACGGAA TAAAAtCTGC TGTTCTGCTT TGGCTAAATT TGCCAAATTG 

151 TTTATTGTTT CTTTAGGaGC AGCTTGCTTA GCCGCCTTCG CTTTCGACAA 

201 CGCCCCCACA GGCGCTTCCC AAGCgTTGCC TACCGTTACC GCACCCGTGG 

251 CGATTCCCGC GCCCGCTTCG GCAGCCTGA 



Number 112 ORF 

1 ATGTTCAGTA TTTTAAATGT GTTTCTTCAT TGTATTCTGG CTTGTGTAGT 

51 CTCTGGTGAG ACGCCTACTA TATTTGGTAT CCTTGCTCTT TTTTACTTAT 

101 TGTATCTTTC TTATCTTGCT GTTTTTAAGA TTTTCTTTTC TTTTTTCTTA 

151 GACAGAGTTT CACTCCGGTC TCCCAGGCTG GAGTGCAAAT GGCATGACCC 

201 TTTGGCTCAC TGGCTCACGG CCACTTCTGC TATTCTGCCG CCTCAGCCTC 

251 CAGGG... 
Number 113 ORF

1 . . GTGCGGACGT GGTTGGTTTT TTGGTTGCAG CGTTTGAAAT ACCCGTTGTT 

51 GCTTTGGATT GCGGATATGT TGCTGTACCG GTTGTTGGGC GGCGCGGAAA 

101 TCGAATGCGG CCGTTGCCCT GTGCCGCCGA TGACGGATTG GCAGCATTTT 

151 TTGCCGGCGA TGGGAACGGT GTCGGCTTGG GTGGCGGTGA TTTGGGCATA 

201 CCTGATGATT GAAAGTGAAA AAAACGGAAG ATATTGA 



Number 114 ORF 

1 ATGTTTCAAA ATTTTGATTT GGGCGTGTTC CTGCTTGCCG TCCTCCCCGT 

51 GCTGCCCTCC ATTACCGTCT CGCACGTGGC GCGCGGCTAT ACGGCGCGCT 

101 ACTGGGGAGA CAACACTGCC GAACAATACG GCAGGCTGAC ACTGAACCCC 

151 CTGCCCCATA TCGATTTGGT CGGCACAATC ATCgTACCGC TGCTTACTTT 

201 GATGTTCACG CCCTTCCTGT TCGGCTGGGC GCGTCCGATT CCTATCGATT 

251 CGCGCAACTT CCGCAACCCG CGCCTTGCCT GGCGTTGCGT TGCCGCGTCC 

301 GGCCCGCTGT CGAATCTAGC GATGGCTGTw CTGTGGGGCG TGGTTTTGGT 

351 GCTGACTCCG TATGTCGGCG GGGCGTATCA GATGCCGTTG GCTCAAATGG 

401 CAAACTACGG TATTCTGATC AATGCGATTC TGTTCGCGCT CAACATCATC

451 CCCATCCTGC CTTGGGACGG CGGCATTTTC ATCGACACCT TCCTGTCGGC 

501 GAAATATTCG CAAGCGTTCC GCAAAATCGA ACCTTATGGG ACGTGGATTA 

551 TCCTACTGCT GATGCTGACC sGGGTTTTGG GTGCGTTTAT wGCACCGATT 

601 sTGCGGmTGc GTGATTGCrT TTGTGCAGAT GTwCGTCTGA CTGGCTTTCA 

651 GACGGCATAA 



Number 115 ORF 

1 ATGAACCTGA TTTCACGTTA CATCATCCGT CJyUVTGGCGG TTATGGCGGT 

51 TTACGCGCTC CTTGCCTTCC TCGCTTTGTA CAGCTTTTTT GAAATCCTGT 

101 ACGAAACCGG CAACCTCGGC AA.riGGCAGTT ACGGCATATG GGAAATGCTG 

151 GGCTACACCG CCCTCAAAAT GCCCGCCCGC GCCTACGAAC TGATTCCCCT 

201 CGCCGTCCTT ATGGGCGGAC TGGTCTCCCT CAGCCAGCTT GCCGCCGGCA 

251 GCGAACTGAC CGTCATCAAA GCCAGCGGCA TGAGCACCAA AAAGCTGCTG 

301 TTGATTCTGT CGCAGTTCGG TTTTATTTTT GCTATTGCCA CCGTCGCGCT 

351 CGGCGAATGG GTTGCGCCCA CACTGAGCCA AAAAGCCGAA AACATCAAAG 

401 CCGCCGCCAT CAACGGCAAA ATCAGCACCG GCAATACCGG CCTTTGGCTG 

451 AAAGAAAAAA ACAGCGTGAT CAATGTGCGC GAAATGTTGC CCGACCAT. . 



Number 116 ORF 



1 . . GCAGTAGCCG AAACTGCCAA CAGCCAGGGC AAAGGTAAAC AGGCAGGCAG

51 TTCGGTTTCT GTTTCACTGA AAACTTCAGG CGACCTTTGC GGCAAACTCA

101 AAACCACCCT TAAAACTTTG GTCTGCTCTT TGGTTTCCCT GAGTATGGTA

151 TTGCCTGCCC ATGCCCAAAT TACCACCGAC AAATCAGCAC CTAAAAACCA

201 GCAGGTCGTT ATCCTTAAAA CCAACACTGG TGCCCCCTTG GTGAATATCC

251 AAACTCCGM TGGACGCGGA TTGAGCCACA ACCGCTA.TA CGCATTTGAT

301 GTTGACAACA AAGGGGCAGT GTTAAACAAC GACCGTAACA ATAATCCGTT

351 TGTGGTCAAA GGCAGTGCGC AATTGATTTT GAACGAGGTA CGCGGTACGG

401 CTAGCAAACT CAACGGCATC GTTACCGTAG GCGGTCAAAA GGCCGACGTG

451 ATTATTGCCA ACCCCAACGG CATTACCGTT AATGGCGGCG GCTTTAAAAA

501 TGTCGGTCGG GGCATCTTAA CTACCGGTGC GCCCCAAATC GGCAAAGACG

551 GTGCACTGAC AGGATTTGAT GTGgGTCAAG GCACATTGgA CCGTAGrAGC

601 AGCAGGTTGG AATGATAAAG GCGGAGCmm yTACACCGGG GTACTTGCTC

651 GTGCAGTTGC TTTGCAGGGG AAATTwmmGG GTAAA.AACT GGCGGTTTCT

701 ACCGGTCCTC AGAAAGTAGA TTACGCCAGC GGCGAAATCA GTGCAGGTAC

751 GGCAGCGGGT ACGAAACCGA CTATTGCCCT TGATACTGCC GCACTGGGCG

801 GTATGTACGC CGACAGCATC ACACTGATTG CCAATGAAAA AGGCGTAGGC

851 GTCTAA
Number 117 ORF

1 . . CGCTTCATTC ATGATGflAGC AGTCGGCAGC AACATCGGCG GCGGCAAAAT 

51 GATTGTTGCA GCCGGGCAGG ATATCAATGT ACGCGGCAnA AGCCTTATT? 

101 CTGATAAGGG CATTGTTTTA AAAGCAGGAC ACGACATCGA TATTTCTACT 

151 GCCCATAATC GCTATACCGG CAATGAATAC CACGAGAGCA wAAAwTCAGG 

201 CGTCATGGGT ACTGGCGGAT TGGGCTTTAC TATCGGTAAC CGGAAAACTA 

251 CCGATGACAC TGATCGTACC AATATTGTsC ATACAGGCAG CATTATAGGC 

301 AGCCTGAaTG GAGACACCGT TACAGTTGCA GGAAACCGCT ACCGACAAAC 

351 CGGCAGTACC GTCTCCAGCC CCGAGGGGCG CAATACCGTC ACAGCCAAAw 

401 GCATAGATGT AGAGTTCGCA AACAACCGGT ATGCCACTGA CTACGcCCAT

451 ACCCAgGGAA CAAAAAGGCC TTACCGTCGC CCTCAATGTC CCGGTTGTCC

501 AAGCTGCACA AAACTTCATA CAAGCAGCCC AAAATGTGGG CAAAAGTAAA 

551 AATAAACGCG TTAATGCCAT GGCTGCAGCC AATGCTGCAT GGCAGAGTTA 

601 TCAAGCAACC CAACAAATGC AACAATTTGC TCCAAGCAGC AGTGCGGGAC 

651 AAGGTCAAAA CTACAATCAA AGCCCCAGTA TCAGTGTGTC CATTAC . TAC 

701 GGCGAACAGA AAAGTCGTAA CGAGCAAAAA AGACATTACA CCGAAgCGGC

751 AgCAAGTCAA ATTATCGGCA AAGGGCAAAC CACACTTGCG GCAACAGGAA 

801 GTGGGGAGCA GTCCAATATC /iATATTACAG GTTCCGATGT CATCGGCCAT 

851 GCAGGTACTC C.CTCATTGC CGACAACCAT ATCAGACTCC AATCTGCCAA 

901 ACAGGACGGC AGCGAGCAAA GCAAAAACA.:^ AAGCAGTGGT TGGAATGCAG 

951 GCGTACGTnn CAAAATAGGC AACGGCATCA GGTTTGGAAT TACCGCCGGA 

1001 GGAAATATCG GTAAAGGTAA AGAGCAAGGG GGAAGTACTA CCCACCGCCA 

1051 CACCCATGTC GGCAGCACAA CCGGCAAAAC TACCATCCGA AGCGGCGGGg 

1101 GATACCACCC TCAAAGGTGT GCAGCTCATC GGCAAAGGCA TACAGGCAGA 

1151 TACGCGCAAC CTGCATATAG AAAGTGTTCA AGATACTGAA ACCTATCAGA 

1201 GCAAACAGCA AAACGGCAAT GTCCAAGTTi ACTGTCGGTT ACGGATTCAG 

1251 TGCAAGCGGC AGTTACCGCC AAAGCAAAGT CAAAGCAGAC CATGCCTCCG 

1301 TAACCGGGCA AAgCGGTATT TATGCCGGAG AAGACGGCTA TCAAATyAAA 

1351 GTyAGAGACA ACACAGACCT yAAGGGCGGT ATCATCACGT CTAGCCAAAG 

1401 CGCAGAAGAT AAGGGCAAAA ACCTTTTTCA GACGGCCACC CTTACTGCCA 

1451 GCGACATTCA AAACCACAGC CGCTACGAAG GCAGAAGCTT CGGCATAGGC 

1501 GGCAGTTTCG ACCTGAACGG CGGCTGGGAC GGCACGGTTA CCGACAAACA 

1551 AGGCAGGCCT ACCGACAGGA TAAGCCCGGC AGCCGGCTAC GGCAGCGACG 

1601 GAGACAGCAA AAACAGCACC ACCCGCAGCG GCGTCAACAC CCACAACATA 

1651 CACATCACCG ACGAAGCGGG ACAACTTGCC CGAACAGGCA GGACTGCAAA 

1701 AGAAACCGAA GCGCGTATCT ACACCGGCAT CGACACCGAA ACTGCGGATC 

1751 AACACTCAGG CCATCTGAAA AACAGCTTCG AC. . . 



Number 118 ORF 

1 ..ACGACCGGCA GCCTCGGCGG CATACTGGCC GGCGGCGGCA CTTCCCTTGC 

51 CGCACCGTAT TTGGACAAAG CGGCGGAAAA CCTCGGTCCG GCGGGCAAAG 

101 CGGCGGTCAA CGCACTGGGC GGTGCGGCCA TCGGCTATGC AACTGGTGGT 

151 AGTGGTGGTG CTGTGGTGGG TGCGAATGTA GATTGGAACA ATAGGCAGCT 

201 GCATCCGAAA GAAATGGCGT TGGCCGACAA ATATGCCGAA GCCCTCAAGC 

251 GCGAAGTTGA AAAACGCGAA GGCAGAAAAA TCAGCAGCCA AGAAGCGGCA 

301 ATGAGAATCC GCAGGCAGAT ATGCGTTGGG TGGACAAAGG TTCCCAAGAC 

351 GGCTATACCG ACCAAAGCGT CATATCCCTT ATCGGAATGA 



Number 119 ORF 

1 . . CAATGCCGTC TGAAAAGCTC ACAATTTTAC AGACGGCATT TGTTATGCAA 

51 GTACATATAC AGATTCCCTA TATACTGCCC AGrkGCGTGC GTgGCTGAAG 

101 ACACCCCCTA CGCTTGCTAT TTGrAACAGC TCCAAGTCAC CAAAGACGTC 

151 AACTGGAACC AGGTACwACT GGCGTACGAC .AAATGGGACT ATAAACAGGA 

201 AGGCTTAACC GGAGCCGGAG CAGCGATTAT TGCGCTGGCT GTTACCGTGG 

251 TTACTGCGGG CGCGGGAgCC GGAGCCGCAC TGGGcTTAAA CGGCGCGGCc 

301 GCAGCGGCAA CCGATGCCGC ATTCGCCTCG CTGGCCAGCC AGGcTTCCGT 

351 ATCGCTCATC AaCAACAAAG GCAATATCGG TAaCACCCTG AAAGAGCTGG

401 GCAGAAGCAG CACGGTGAAA AATCTGATGG TTGCCGTCGc tACCGCAgGC
451 GTagCcgaCA AAATCGGTGC TTCGGCACTG AACAATGTCA GCGATAAGCA
501 GTGGATCAAC AACCTGACCG TCAACCTGGC CAATGCGGGC AGTGCCGCAC 
551 TGATTAATAC CGCTGTCAAC GGCGGCAGCc tgAAAGACAA TCTGGAAGCG

601 AATATCCTTG CGGCTTTGGT GAATACTGCG CATGGAGAAG CAGCCAGTRA 

651 AATCAAACAG TTGGATCAGC ACTACATTAC CCACAAGATT GCCCaTGCCA 

701 TAGCGGGCTG TGCGGcTGCG GCGGCGAATA AGGGCAAGTG TCAGGATGGT 

751 GCGATAgGTG CGGCTGTGGG CGAGATAGTC GGGGAgGCTT TGACAAACGG 

801 CAAAAATCCT GACACTTTGA CAGCTAAAgA ACGCGaACAG ATTTTGGCAT 

851 ACAGCAAACT GGTTGCCGGT ACGGTAAGCG GTGTGGTCGG CGGCGATGTA 

901 AATGCGGCGG CGAATGCGGC TGAGGTAGCG GTGAAAAATA ATCAGCTTAG 

951 CGACAAAtGA 



Number 120 ORF 

1 ATGGCAATCA TTACATTGTA TTATTCTGTC AATGGTATTT TAAATGTATG 

51 TGCAAAAGCA AAAAATATTC AAGTAGTTGC CAATAATAAG AATATGGTTC 

101 TTTTTGGGTT TTTGGsmrGC ATCATCGGCG GTTCAACCAA TGCCATGTCT 

151 CCCATATTGT TAATATTTTT GCTTAGCGAA ACAGAAAATA AAAATcgTAT 

201 CGTAAAATCA AGCAATCTAT GCTATCTTTT GGCGAAAATT GTTCAAATAT 

251 ATATGCTAAG AGACCAGTAT TGGTTATTAA ATAAGAGTGA ATACGdTTTA 

301 ATATTTTTAC TGTCCGTATT GTCTGTTATT GGATTGTATG TTGGAATTCG 

351 GTTAAGGACT AAGATTAGCC CAaATTTTTT TAAAATGTTA ATTTTTATTG 

401 tTTTATTGGT ATTGGCtCTG AAAATCGGGC AttCGGGTTT AAtCAAACTT 

451 TAA 



Number 121 ORF 

1 ATGTTACGTt TGACTGCtTT AGCCGTATGC ACCGCCCTCG CTTTGGGCGC 

51 GTGTTCGCCG CAAAATTCCG ACTCTGCCCC ACAAGCCAAA GaACAGGCGG 

101 TTTCCGCCGC ACAAACCGAA GgCGCGTCCG TTACCGTCAA AACCGCGCGC 

151 GGCGACGTTC AAATACCGCA AAACCCCGAA CGCATCGCCG TTTACGATTT 

201 GGGTATGCTC GACACCTTGA GCAAACTGGG CGTGAAAACC GGTTTGTCCG 

251 TCGATAAAAA CCGCCTGCCG TATTTAGAGG AATATTTCAA AACGACAAAA 

301 CCTGCCGGCA CTTTGTTCGA GCCGGATTAC GAAACGCTCA ACGCTTACAA 

351 ACCGCAGCTC ATCATCATCG GCAGCCGCGC CgCCAAGGCG TTTGACAAAT 

401 TGAAcGAAAT CGCGCCGACC ATCGrmwTGA CCGCCGATAC CGCCAACCTC 

451 AAAGAAAGTG CCAArGAGGC ATCGACGCTG GCGCAAATCT TO. .



Number 122 ORF 

1 ATGAGACATA TGAAAATACA AAATTATTTA CTAGTATTTA TAGTTTTACA 

51 TATAGCCTTG ATAGTAATTA ATATAGTGTT TGGTTATTTT GTTTTTCTAT 

101 TTGATTTTTT TGCGTTTTTG TTTTTTGCAA ACGTCTTTCT TGCTGTAAAT 

151 TTATTATTTT TAGAAAAAAA CATAAAAAAC AAATTATTGT TTTTATTGCC 

201 GATTTCTATT ATTATATGGA TGGTAATTCA TATTAGTATG ATAAATATAA 

251 AATTTTATAA ATTTGAGCAT CAAATAAAGG AACAAAATAT ATCCTCGATT 

301 ACTGGGGTGA TAAAACCACA TGATAGTTAT AATTATGTTT ATGACTCAAA 

351 TGGATATGCT AAATTAAAAG ATAATCATAG ATATGGTAGG GTAATTAGAG 

401 AAACACCTTA TATTGATGTA GTTGCATCTG ATGTTAAAAA TAAATCCATA

451 AGATTAAGCT TGGTTTGTGG TATTCATTCA TATGCTCCAT GTGCCAATTT

501 TATAAAATTT GTCAGG. . 



Number 123 ORF 

1 ..ACCCCCAACA GCGTGACCGT CTTGCCGTCT TTCGGCGGAT TCGGGCGTAC 

51 CGGCGCGACC ATCAATGCAG CAGGCGGGGT CGGCATGACT GCCTTTTCGA 

101 CAACCTTAAT TTCCGTAGCC GAGGGCGCGG TTGTAGAGCT GCAGGCCGTG 

151 AGAGCCAAAG CCGTCAATGC AACCGCCGCT TGCATTTTTA CGGTCTTGAG 

201 TAAGGACATT TTCGATTTCC TTTTTATTTT CCGTTTTCAG ACGGCTGACT 

251 TCCGCCTGTA TTTTCGCCAA AGCCATGCCG ACAGCGTGCG CCTTGACTTC 

301 ATATTTAAAA GCTTCCGCGC GTGCCAGTTC CAGTTCGCGC GCATAGTTTT 

351 GAGCCGACAA CAGCAGGGCT TGCGCCTTGT CGCGCTCCAT CTTGTCGATG 

401 ACCGCCTGCA GCTTCGCAAA TGCCGACTTG TAGCCTTGAT GGTGCGACAC 
451 AGCCAAGCCC GTGCCGACAA GCGCGATAAT GGCAATCGGT TGCCAGTAAT
501 TCGCCflGCAG TTTCACGAGA TTCATTCTCG ACCTCCTGAC GCTTCACGCT 
551 GA 
APPENDIX C

The following partial DNA sequence was identified in N. meningitidis <SEQ ID 1>: 
gnm_1

GAAAATTCAGCAGCAGCAGGAAGATTGCCAGCATTTGCGCGGCGGTTTTAAACTTACCGA
CGGTGGCGACGGCAACGCTGTTCCTTTTGCCCATTTGCGCCATCCATTCGCGCAATGCGG 
AAATGGTAATTTCCCTGCCGATGATGATCATGGCAAACAAAACATAGGTCCGGTCGAGTT 
TGACCAGTAAAAGCAAAGAGACGGCGACCATCAGCTTGTCGGCAACGGGATCGAGGAAGG 

CGCCGAAATCCGAGGTCTGTTTCCACAACCTTGCCAAAAATCCGTCAAACCAGTCGGTCA 
AGGCGGCAACGGCAAAAATGACGGCGGCGGTGAGATTAATCGTTTCCTCCGCGAACCACG
GAAAAGGCAGGTAAAAAAGGGCTGTCAGGACAGGAATGAGCAAGACCCTCAACCATGTGA 

ggaagatggggagattccaaggcatcggttttctctgtgcagactgtaaagttgtgatta 
taacggttatcctcataacccaaaacgtaaaattgctgcatgggcattcccccgccccgc 
caatctgttttcacattcttttcaaacgcaggaaaatggcgggcaataaaagcaaaatac 

ccagtttcaggctgaaaacggcaggttgtgccaacacttcgacaaggcggtcttccgtgc
gggctuvaatctttattgcttatagacactgccactgttgcggtattccaacagaacgccg 
tttaaaaaacctttgccgacggtttcgcttaaaacggctctaacctgctccgccctgatg 
gttctgccgatattgccgcctgtgcacaaactgtcgaacccatagcaggaaagccggtaa 
tgctgcccgtctgcatccagtttgattgcccgtccgctgcggttgagggcggtaacggtc 

aattccgcatattcgaatgttttttcttgttcgtgaaatgccgtcaggtaaggtgcaata
aaaacggcggacaacagcagacagcttatggcggcaaaccatacccagcgataatatagt 
ggattaaatttaaaccagtacagcgttgcctcgccttagctcaaagagaacgattctcta 
aggtgctgaagcaccaagtgaatcggttccgtactatttgtactgtctgcggcttcgtcg 
ccttgtcctgatttaaatttaatccactatatttcacgcttaccccttgtttctcaaatg 

ccgtctgaaataagcggcttaatatattgtttacagtattgggaagcataacagAcaaaa
tgccgtctgaaatattttcagacggcatttcttatccgaaacggattatttttgcgtttc 
aaccgcttccaatgcacgcagggcataagtgtaagcggcacccgcattcagggcaatggc 
ggttgccaatgcacctgcgatttcgctgtcggtcgcaccggctttggtggcggcggcggc 
gtgaacactgatgcagctctcacpacgtgtagtaatggcaacggcgatggcaatcagttc 

gcgtgttttagcatcaagtgcctctgcagctgccgcttgttccaatgcgccgtaggcctg
cagcattttaggatgcgccttac::cagctcgccgaacgattttttaaccaatgcggtatg 
ttctttccaatctttaaacattttcttttcctttctcttgcgtttaaccctgatacgcgc 
ttgcgtatctgttttcgatgtgcgtattattgcaattattcagttgtgtttctcgtttaa 
tcatctcattttatggttcaaaaagatttatggacattctggacaaactggtcgatttcg 

cccaattgacgggcagtgtggatgtgcagtgccttttgggcggacaatggtcggtacggc
atgaaaccttgctu^cgcgaaggattggtacacattgttacatcgggcagcggctatctct 
gcatcgacggcgaaacttccccgcgtccggtcagtacaggggatattgtatttttcccgc 
gcggcttgggtcatgtgttgagccacgacggaaaatgcggagaaagtttacaaccggata 
tgcggcagcacggtgcgtttacggtcaagcagtgcggcaacggacaggatatgagcctgt 

tttgcgcccgtttccgctacgacacccacgccgatttgatgaacgggctgcctgaaaccg
tttttctgaacattgcccatccgagtttacagtatgtggtttcaatgctgcaactggtw^ 
gcaaaaaacctttgacggggacggtttccatggtcaacgcattgtcgtccgtcctgctgg 
tgcttatcctgcgcgcctatctcgaacaggataaggatgtcgaactctcgggcgtattga 
aaggttggcaggacaaacgtttgggacatttaatccaaaaggtgatagacaaaccggaag 

acgaatggaatgtcgacaaaatggtggcggctgccaatatgtcgcgcgcgcaactgatgc
gccgtttcaaaagccgggtcggactcagcccgcacgcctttgtgaaccatatccgcctgc 
aaaaaggcgcgttgctgctgaaaaaaaacccggattcggttttgtcggtcgcactgtcgg 
taggctttcagtcggaaacgcacttcggcaaggcgttcaaacggcaatatcacgtttcgc 
cgggtcaataccggaaagaagcgggcaaaaataaatcggggcttcaaacgcaaatgccgt 

ctgaaaaggctttcatacagcatttgcgtaccgcgtcatttcaagggctgcatcttcatc
acttccatcaaaaagttggtaaatgcggggttgttgggtttgacatccatatttttccaa 
cgctgctgccagccgcgcaaggcattctggatatacagcttggactgttccgtattgatt 
gcgccccgctggctgtctatcgccgaacgcaggtagatttcatacatactgtcatcgacg 
gcattgcgtccgaccaggcgttttctgaagttgttcagatattgcgccgcctgaaccttg 



WO 00/22430

PCT/US99/23573



GTCATTTTACCGATACCCACCTGATAGCCCAAGCGCGTCGCTTCATCGCTGATTTTGGCA 
ACATCCGTCCAATGCGAAGAGGCAAGGCGGAAACCTTTTGCAGGTGCTTCCGTTTTGACG 
GTATTGATAGGATTCACGGGGATTTCCGTCAATGTGGGCACATAAATAGACTGGCAGCCG 
GAAAGAACTGCCGCAATGGAAAGAGGGATAAGGTATTTTTTCATGCCCCCATTATAATCA 
AGTTTGCCTTGAGAAAACAAATTGTTCGGCAAGAAAAATAAAATTTCGGCATCAGAAGCA
GGCAAAAACACATTCCACAAGCCTTGCCGCAAGGTTTACAATCCGACCGTCCTTATCGCA 
ACGACCGTTTATGGATACCGCAAAAAAAGACATTTTAGGATCGGGCTGGATGCTGGTGGC 
GGCGGCCTGCTTTACCATTATGAACGTATTGATTAAAGAGGCATCGGCAAAATTTGCCCT 
CGGCAGCGGCGAATTGGTCTTTTGGCGCATGCTGTTTTCAACCGTTGCGCTCGGGGCTGC 
CGCCGTATTGCGTCGGGACAmCTTCCGCACGCCCCATTGGAAAAACCACTTAAACCGCAG
TATGGTCGGGACGGGGGCGATGCTGCTGCTGTTTTACGCGGTAACGCATCTGCCTTTGGC 
CACTGGCGTTACCCTGAGTTACACCTCGTCGATTTTTTTGGCGGTATTTTCCTTCCTGAT 
TTTGAAAGAACGGATTTCCGTTTACACGCAGGCGGTGCTGCTCCTTGGTTTTGCCGGCGT 
GGTATTGCTGCTTAATCCCTCGTTCCGCAGCGGTCAGGAAACGGCGGCACTCGCCGGGCT 
GGCGGGCGGCGCGATGTCCGGCTGGGCGTATTTGAAAGTGCGCGAACTGTCTTTGGCGGG
CGAACCCGGCTGGCGCGTCGTGTTTTACCTTTCCGTGACAGGTGTGGCGATGTCGTCGGT 
TTGGGCGACGCTGACCGGCTGGCACACCCTGTCCTTTCCATCGGCAGTTTATCTGTCGTG 
CATCGGCGTGTCCGCGCTGATTGCCCAACTGTCGATGACGCGCGCCTACAAAGTCGGCGA 
CAAATTCACGGTTGCCTCGCTTTCCTATATGACCGTCGTTTTTTCCGCTCTGTCTGCCGC 
ATTTTTTCTGGGCGAAGAGCTTTTCTGGCAGGAAATACTCGGTATGTGCATCATCATCCT
CAGCGGTATTTTGAGCAGCATCCGCCCCACTGCCTTCAAACAGCGGCTGCAATCCCTGTT 
CCGCCTIAAGATAAAAAATGCCGTCCGAACATCCTTCAGACGGCATATCGGGCTTTATTTC 
CCCGCCTTCACATCCTGCCACTGGCGCACCATAAACTTCAATGCCGCCGGCTGGATAGGC 
ACCATGATAAAGCTGTTTTTCAAATCCTCCTCGGTTGGGAAAATCGTATTGTCGTTTTTA 
AATTCGTCTTCCATCAGCTCACGCGCAGGCTTGCTCGAAGGCGCGTAAGTAACGAAATTG
CCGTTTTTCGCCGACACTTCCGGGTCGAGGAAGTCGTTGATGTATTTGTGCGCGTTGGCG 
ACGTTTTTCGCATCTTTCGGAATCACGAAAGAATCCACCCAAATCCCCACGCCCTCTTTG 
GGCATCATCACGCGGATTTTTTCCTTGCCGCCCGCTTCTTCGGCACGGCGTTTGGCGATG 
TTCAAATCGCCGCCGAAACCGATTGTTACGCAGGTATCGCCGCGCGCCAAATCATCGATA 
AAGCCGGACGAAGTAAAGCGTTTGATATTGGGGCGGTTTTTCTTGAGTAGGGCGGTTGCC
TCCCTGATGTCTTCCGTATTGCTGCTGTTCGGGTTTTTACCCAAATAGTTCAACACCATA 
GGATAGATTTCCGCCGCGCTGTCCAAATAGCTGATGCCGCATTGCTTGAGTTTGGACGTG 
TATTCGGGGTCGAACACCAAATCCCACTGGTTGTCCGGCAGCTTGTCCGTACCCAAAGCC 
TTTTTCACGCGTTCGGTATTGATGGCGAAGGTATTTGTCCCCCAATAAAACGGCACGGCG 
TATTCGTGGCCGGGATCGACCCCGTCCATCAGCCTCATCATTTCGGGGTTGAGGTGTTTA
TAATTGGGAATCAGCGACTTATCGATTTTCTGATACGCACCTGCCTTAATCTGCCTGCCC 
ACAAACGCATTGGACGGCGCGACAATGTCGTAACCGGACTTGCCTGTCAGCACCTTGCTT 
TCCAGCGTTTCATCGCTGTCGTACACATCATAAGTAACCTTGATGCCGTTTTTCTTTTCA 
AAATCGGCAACGGTTTCCGGATCGACATATTCCGACCAGTTGTAAATTTTCAATACGTTT 
TGGTTTTCCGCCGGTGCCGGTTTTTCGGCAGGCGGTTTGTCCGAACCGCCGCACGCTGCA
AGCAGCAAAGCAGTCAGGACGGCCAGGGGCAGATGTTTGGTCATTATCATTCCTTGCATA 
TCGGGTTGGAGAAAGCGGCCATTATAGCCGATATTGGCAACAGGGCTTCAGACGGCATTC 
AAAATCCCGCCACACTCTTCCGAAAACCGCCGCTTCCATAGCTAGAAACAGGGATTTGCG 
GTAAGATACCGCCGTTCGTTTTCCCTGCTTTTACCATGACAAGACATTTGAGAGACATTG 
AAAAAATTATGAAAACCTCCGAACTGCGCCAAAAATTCCTAAAATTTTTTGAAACCAAAG
GCCACACCGTCGTCCGCTCTTCCAGCCTCGTGCCGCACGACGACCCGACCCTGCTGTTTA 
CCAACGCGGGCATGAACCAGTTTAAAGACGTATTCTTAGGTTTCGACAAACGCCCGTACA 
GCCGCGCCACCACCGCGCAAAAATGCGTACGCGCAGGCGGCAAACACAACGACTTGGAAA 
ACGTCGGCTACACCGCCCGCCACCACACCTTCTTTGATyVTGATGGGCAACTTCTCCTTCG 
GCGACTACTTCAAACGCGACGCCATCCACTTCGCTTGGGAATTTCTGACTTCCCCCGAAT
GGCTCAACATCCCTAAAGACAAACTGTTGGCGACCGTTTACGCGGAAGACGACGAAGCCT 
ACAACATCTGGTTGAACGAAATCGGTATGCCGTCCGAGCGCATCGTCCGCATCGGCGACA 
ACAAAGGCGCGAAATACGCATCCGACAACTTCTGGCAAATGGGCGACACCGGCCCTTGCG 
GCCCCTGCTCCGAAATTTTCTACGACCACGGCGAAGAAATCTGGGGCGGCATTCCCGGCA 
GTCCCGAAGAAGACGGCGACCGCTGGATCGAAATTTGGAACTGCGTATTTATGCAGTTCA
ACCGCGACGAACAAGGCAATATGAACCCGCTTCCCAAACCTTCCGTCGATACCGGTATGG 
GCTTGGAACGCATAGCCGCCGTCATGCAGCATGTTCACAGCAACTACGAAATCGACTTGT 
TCCAAGACCTGCTCAAAGCCGTTGCCCGCGAAACCGGCGCGCCGTTCAGAATGGAAGAAC
CCAGCCTGAAAGTCATCGCCGACCACATCCGCTCCTGCTCGTTCCTGATTGCAGACGGCG 
TCTTGCCTTCCAACGAAGGCCGCGGCTAGGTATTGCGCCGCATTATCCGCCGCGCCGTGC 
GCCACGGTTACAAACTGGGTCAAAGCAAACCGTTCTTCCACAAACTCGTTGCCGATTTGG 
TCAAAGAGATGGGCGGTGCCTACCCTGAATTGAAAGAAAAACAAGCCCAAATCGAAGAAG
CATTGAAAAACGAAGAAAGCCGTTTTGCCCAAACGCTGGAAACCGGTATGGCTTTGTTGG 
AAAACGCGCTGGTyAAAGGCGGCAAAACACTCGGCGGCGAAATCATCTTCAAACTCTACG 
ATACCTACGGTTTCCCATACGACTTGACTGCCGACATCTGCCGCGAACGCAATATCGAAC 
CGGACGAAGCAGGCTTCGAGCGCGAAATGGAAGCCCAACGCGCACGCGCACGCGCCGCCC 
AAAGCTTCAAAGCCAACGCCCAACTGCCTTATGACGGTCAAGACACCGAGTTTAAAGGTT
ATAGCGAACGCCAAACCGAATCCAAAGTCCTCGCCCTCTACAAAGACGGCGAGCAAGTCA 
ACGAATTGAACGAAGGCGACAGCGGCGCAGTCGTCATCGACTTTACCCCGTTCTATGCAG 
AATCCGGCGGCCAAGTCGGCGATGTCGGCTATATCTTCTCAGGCGAAAACCGCTTTGAAG 
TACGCGATACCCAAAAAATCAAAGCGGCCGTATTCGGTCAATTCGGCGTACAAACTTCAG 
GCCGTCTGAAAGTCGGCGACAGCGTTACCGCCAAAGTGGACGACGAAATCCGCAATGCCA
ATATGCGCAACCACAGCGCAACCCACTTGATGCACAAAGCCCTGCGCGATGTATTGGGCA 
GACACGTCGAACAAAAAGGCTCTTTGGTTACCGCCGAATCCACCCGTTTCGACATTTCCC 
ATCCCCAAGCGGTAACTGCCGAAGAAATTGCCGAAGTAGAACGCCGCGTCAACGAAGCCA 
TTTTGGCGAACGTTGCCGTCAATGCAGCCATTATGAGCATGGAAGACGCGCAAAAAACCG 
GCGCGATGATGCTCTTCGGCGAAAAATACGGCGAAGAAGTGCGCGTACTGCAAATGGGCG
GTTTCTCTACCGAATTGTGCGGCGGCACACACGTTTCACGCACCGGCGACATCGGCCTCT 
TCAAAATCATCAGCGAAGGCGGTATTGCCGCAGGCGTGCGCCGTATCGAAGCCATCACCG 
GCCTGAACGCACTCAAATGGGCGCAAGAGCAA.GAGCGTTTGGTGAAAGACATTATTGCCG 
AAACCAAAGCCCAAACCGAAAAAGACGTACTGGCAAAAATCCAAGCAGGCGCGGCACACG 
CCAAAGCATTGGAAAAAGAATTGGCACGCGCCAAAGCCGAACTCGCCGTCCACGCAGGCG
CCAAACTCTTGGACGATGCAAAAGACTTGGGCGCAGCCAAACTCGTTGCCGCCCAAATCG 
AAGCCGACGCAGCCGCCCTGCGCGAAATCGTTACCGATTTAACCGGTAMTCCGACAACG 
CCGTGATTCTTTTAGCGGCAGTAAACGACGGCAAAGTCTCCCTGTGCGCCGGCGTATCCA 
AACCGTTGACCGGCAAAGTGAAAGCAGGCGATCTGGTTAAATTTGCAGCCGAACAAGTCG 
GCGGCAAAGGCGGCGGCAGACCAGATTTGGCGCAAGCCGGCGGCACGGATGCCGACAAAT
TGCCCGCCGTGTTGGATAGCGTGAAAGACTGGGTCGGCGCGAAGCTGGTTTGATGTGGGA 
AAGGCAGCCTGAAAGGTTTCAGGCTGCCTTTTGTGCAAAGAGGCCGTCTGAAAGGTCTCG 
TTTGCCGTAGGTTGGGTCGCGACCCAACAAATTTTGTGAAGTATAAAA.'\TGTTGGTCATG 
ACCCAACCTACCTGCCTTTTTGTACAAAGAGGCTATCTGAAAGGCCTTGTTTGCCGTATG 
GTGGGTCGCGACCCAGCAGATTTTTATTAGGGTATGACCCAAGCTACTTGCTACGATAAA
AAAGGATTTTTAAATGAGCATTAGCCTTATTGGACTACACATTACCATAGCAATCATTTT 
GTTTTTTACTACAAATTTTATGGGAAAAAAATCATCTATATTTGGCTATTACCAACTGTC 
TTTTAGCGAAGAAAATCACTCTCCGGCATTTAATATTTTTTACAGAGCATTTACCCCTAT 
ATTATTTATCGTTATTTTTTCTTGGGTTGTTACTAGTCTTGAAATTCCCATTTCTCTTGA 
AAAGATAAACTATGTAGTAATTTATTATTTTATAATTAGATTGTTATCTGTATTTGTTTT
TGAGAAAACACACATAGTTAACTGGTTTAATC/iACTAACAATACCCATACTATCCATAAC 
ATTATCATTTATAGTATATAACAAAATGATTTTGCCCAAAAGTTTTCTACTTCCATCCTC 
ACAAG/^GTAGCTACTACTTTTTGAATAGCGCTTGGTGGTTACATATATAATATATTAAA 
TAATGAATCAGGGCATTTAAAATCTTATAAAGAAAGAAGAGTAAATTATGTAAAACACAT 
GCACAAAAAATTTGAAAGTTATTTTGGTAAAATTATAGATAAAATAATCAAAGAGGATAG
TTATAATAATGATGATTTTTTAACCGATAAGAAAAAAGCACTAATATATTCAGTTTTAAT 
TTATGAGAATTTTAATAGGGGACTAGTTTATAGATATTTTGAAAAAAATTATTTTGTACT 
GGTAGAATAAAAACATTTGGAATAATGCAAGTAACCTCAGCAGAGTACCTTTCCAATGAG 
GAAAGTATAAAAAAAGGCGGAAATATTCTTATGGAAAAATACAATGAAAAATATAATGAA 
TCTATTGATGGCAATAAAACTCTCTATAAATCATATTATGAATCAAGAAGAGAGAGTATT
AAAAACTACAACCCAGATGCAAAATACATTAATGAAATTGAATCAATTTACATGATGCTT 
GGAGAAATCTATCCAAATGCACCAGACTTCATGTCACCACATTTTGAGGGGGACTGCTCT 
GAGGGGGAATAAAATCATTATTTATTCTTTATTAGTTATTAGCAGGATTTGTCGGGCATA 
AATGCCCGACCTACAAATTCAATTTTTTCAAACCTCTGCCAAATATTTTCATCTTTGCAA 
GGCTGTCTGAAAACCC7\AACCCCATTTTCAGACGGCCTTTTTTCGCTAAAATCCCCATAC
CGTTCAATCCGAAAACACAGGAGAATCATCATGGAAGTTACCATCTCCGCCATCATCAAT 
GGCGAATTTGCCGACCAATACGGCAAGCGCGGTAGTCAGTTTAATGAAAACGGGATGCTG 
ATTTAATTCTATTTCGTTTGAAACTACCAATAACCTGCCTCCATCATAAAACTAAAAGCA
AGCCGTAGCCTGCATTCCCACAAACCGCGTGCGTTGCCATGTCACACACCCTACCTGCGG 
GCGACGCAAACCTTAAGAGACCTTTGCAAAATTCCCCAAAATCCCCTAAATTCCCACCAA 
GACATTTAGGGGATTTCTCATGAGCACCTTCTTTCAACAAACCGCCCAAGCCATGATTGC 
CAAACACATCGACCGCTTCCCGCTATTGAAGTTGGACCGGGTGATTGATTGGCAGCTGAT
CGAACAATACCTGAACCGTCAAAAAACCCGTTACCTTAGAGACCACCGCGGCCGTCCTGC 
CTATCCCCTGCTGTCCATGTTCAAAGCCGTCCTGCTCGGACAATGGCACAGCCTCTCCGA 
TCCCGAACTCGAACACAGCCTCATTACCCGCATCGATTTCAACCTGTTTTGCCGTTTTGA 
CGAACTGAGCATCCCCGATTACAGCACCTTATGCCGCTACCGCAACCGGCTGGCGCAAGA 
CAATACCCTGTCTGAACTGTTGGAACTGATTAACCGCCAACTGACCGAAAAAGGTTTAAA
AATAGAGAAAGCATCCGCTGCCGTCGTTGACGCCACCATTATTCAGACCGCCGGCAGCAA 
ACAGCGTCAGGCCATAGAAGTTGACGAAGAAGGACAAATCAGCGGTCAAACCACACCGAG 
TAAGGACAGCGATGCCCGTTGGATAAAGAAAAACGGCCTCTACAAACTCGGTTACAAACA 
ACATACCCGTACCGATGCAGAAGGCTATATCGAGAAACTGCACATTACCCCCGCCAATGC 
CCATGAGTGCAAACACCTGTCGCCGTTGTTGGAAGGTCTGCCCAAAGGTACGACCGTCTA
TGCCGACAAAGGCTATGACAGTGCGGAAAACCGGCAACATCTGGAAGAACATCAGTTGCA 
GGACGGCATTATGCGCAAAGCCTGCCGCAACCGCCCGCTGTCGGAAGTGCAAACCAAGCG 
TAACCGATATTTGTCGAAGACCCGTTATGTGGTCGAACAAAGCTTCGGTACGCTGCACCG 
TAAATTCCGCTATGCCCGGGCAGCCTATTTCGGACTGATTAAAGTGAGTGCGCAAAGCCA 
TCTGAAGGCGATGTGTTTGAACCTGTTGAAAGCCGCCAACAGGCTAAGTGCGCCCGCTGC
CGCCTAAAAGGCAGCCCGGATGCCTGATTATCGGGTGTCCGGGGAGGATTAAGGGGGTGT 
TTGGGTAAAATTAGGCGGTATTTGGGGCGAAAACAGCCGAAAACCTGTGTTGGGATTTCG 
GTTGTCGTGAGGGAAAGGAATTTTGCAAAGGTCTCCAGCAGTTTGCGCATACATGCCGTA 
ACGGCAACCTTATACGGCTTACCCTCGGACAGCGGGCGTTGGTGGAAATCCCGAATAAGC 
GGTTCAAAACGTGTCGCTGCCACGGTAGCCATATACAGTGCCTTAAGCACCGCAGACCTT
CCGCCAAAGCAGCGGCTTTTGAATTTGGCTTCCCCGCTCTTCCTCGGGTGCGGGGCAATG 
CCGACCAAACTCGCTATCCGTTTGTGCGACAGCCGCCCCAATTCAGGTAGCATCGCCATC 
AGCGTAGCCGTCGTTATCGAACCGATGCCTTTGATTTGCTCCGCCACTTGGGCTTTGCCG 
TCAAAATGCGTGTGGGTGTGGTCGTCGATTTGTTTGTCCGATTCGTCAATCAGCCGGTCA 
AAATGGGCAATCAGTTGTTTGACGCTTCCGACTTGCGTTTCGTGAACCTGATGCAGACGG
TTTTTCTCGGCAGTCCGCATATCCGCCGATTGGTTGCGGCGGTTAACCAAGGCTTCCAAC 
ACTTCTTCCGCTTCTGTGGGCGGGTGGTAGGGCATGGTTTGCCAATCTTCTTTCTGTGCC 
TTCATCTGTGCGAAGAAGGCAGGCATTTTGGCATCTTTGGCGTCGGTTTTGGTCAGCGAC 
TGCGATTGGGCAAACTGATGCGTCTGACGCGGGTTGGCGATAATCACGGCTATGCCTGCT 
CGGTGGATGGCTTTGGCGGCGGGGATTTCGAGACCTCCGGTACTTTCCGTCACGACGAGG
GCGACCTTGTGTTTTTTAAGGTATTCGATAGTATGGGCGATACCTTTGGGGTTGTTGGTT 
TCGGTTTTGGTTTTAGACAAAGACGAAACGGCGATGACGAAGTTTCGTTTGGCGATGTCG 
ATATAGTGAATTAACAAAAATCAGGACAAGGCGGCGAGCCGCAGACAGTACGGATAGTAC 
GGAACCGATTCACTTGGTGCTTCAGCACCTTAGAGAATCGTTCTCTTCGAGCTAAGGCGA 
GGCTiACGTCGTACTGGTTTTTGTTAATTCACTATATCTGTGCGTTACGACGGCATGCCGT
CTGAAGGGTGTTTATGTCTGCATCTAAGAAATTTCCGATTCCTTTGAGCTATTTCAGCAT 
CGCGCTGGGCTTGTTTGCCTTGGGGCTGTCGTGGCGTTACGGCGCGTCTGTCGGGCTGCT 
GCCCGCCTTGGCCGCCGAATCGCTGCTTGCGGCGGCTTCGGTCGTCTGGCTCTTGCTGGT 
GGCGGCATACCTGATCAAAATGTTTGCGTACCGAAACGATTTTTTGTCTGATTTACGCGA 
CTTGGTGCAATGCTGCTTCATCAGCGCGATTCCGATTACCGCTATGCTGGAGGGACTCGC
GCTGAAGCCCTATCAGGCAGGCGCGGCGGCAGTCCTGATTTATGTCGGCGTTGCCGGACA 
GTTGGCTTTTTCGATGTATCGGGCGGCCGGTCTGTGGCGCGGCCTGCATTCCTTGGAGGC 
GACGACGCCGATTATTTATCTGCCTACGGTTGCGACAAACTTTGTCAGCGCGTCATCTCT 
GGCGGCGTTGGGGCATCATGATTATGCAGCTTTGTTTTTCGGCGCGGGTATGTTTTCCTG 
GCTGAGCTTGGAAGCCTCCATCTTGGGCAGGCTGCGCACGGCGGCACCGGTCGGCACGGC
GGCGCGCGGCGTGGTCGGCATCCAGCTTGCGCCCGCCTTTGTCGGCTGCGGCGCGTATTT 
TGCCGTCGGCGGTAAAGTCGACGGTTTTGCGTTGGCATTAATCGGCTACGGCTGCCTGCA 
GCTTTTGTTCTTGCTGCGCCTGACCCGCTGGTTTTGGGAAGGTGGTTTTACGATGAGCTT 
TTGGGGATTTTCATTCGGTTTCGCGGCAATGGCAGGATGCGGTCTGCATCTGGCGGCTTC 
CGGCGTATTGTCGGGCTTGGGGCTGACGCTTGCCACCGCCGGATCGGCAGGCGTGGCGCT
GCTGCTTGTCGGTACGCTGCACCGGATAGCGACGGGGCGTTTCTTGGTACGCAGCTGATG 
CGTTTTGCCGCCTTGTCAAAAATGCCGTCTGAAACGCTGGGATTCAGACGGCATTTTTTA 
TTTCACACCCTTACAGGTAGAATTTTTCGATGACTTTCAAATTGTCGTCCAATTTGTACA
CCAACGGCTGACCGGTCGGGATTTCCAAGCCCATAATGTCTTCGTCGGAAATGCCCTCGA 
TGTGTTTTGCCAGCGCGCGCAGGGAGTTGCCGTGCGCCGCCACCAAGACGCGTTTGCCGC 
TCAAAATCGCGGGGGCGATTTGGTCTTCCCAAAACGGCAATACGCGCTCCAGCGTTACTT 
TCAGGTTTTCGCCGTCGGGTACGACATCGGCAGGCAGATGGGCATAGCGGCGGTCTTTGT
GTGCGGAAAACTCATCGTCTTTGTCCAAAAGCGGCGGCAGGGTGTCGTAGCTGCGCCGCC 
AGATGCGGACTTGCTCGTCGCCGTATTGTTCGGCGGTTTGTTTTTTGTCCAGGCCTTGCA 
GTTGGCCGTAGTGGCGTTCGTTCAGCCGCCACGTTTTGATTTGCGGTACGAACAGTTGGT 
CGGATTCTTCCAAAACGATGTTGCAGGTCTTAATCGCGCGGGTCAGGACGGATGTGAAGG 
CGATGTCGAACTCATAGCCGTTTTCTTTCAGTTTCTTGCCGGCGGCGGCAGCCTCGGCAA
GCCCCTGCTCGCTCAGCTTCACGTCGCGCCAGCCTGTAAACAGGTTTTTCGCGTTCCATT 
CGCTTTGTCCGTGGCGGATAAATACCAGTTCCATATCGTCTCCAATGTGTGAAAGTGGGA 
AAGCCTTATTTATAACATATTTTCACATTTCCCGTATTTGATTCAGATTCAGACACGCGC 
CCACTATGGTTTGCCGTTTTGATTTACAATAATGTCCTTTGCTTTACATTCCGCATACAC 
AATGAATACGCAAGCGCACGCCCCACATACCGATTCCAATACGCTGATGCTCGGCCGATA
CGCCGAACGCGCCTATCTCGAATACGCCATGAGCGTGGTCAAAGGCCGCGCGCTGCCTGA 
AGTTTCAGACGGCCAGAAGCCCGTGCAGCGGCGCATTTTGTTTGCCATGCGCGATATGGG 
TTTGACGGCGGGGGCGAAGCCGGTGAAATCGGCGCGCGTGGTCGGCGAGATTTTGGGTAA 
ATACCACCCGCACGGCGACAGTTCCGCCTATGAGGCGATGGTGCGGATGGCGCAGGATTT 

TACCTTGCGCTATCCCTTAATCGACGGCATCGGCAACTTCGGCTCGCGCGACGGCGACGG
GGCGGCGGCGATGCGTTACACCGAAGCGCGGCTGACGCCGATTGCGGAATTGCTGTTGTC 
CGAAATCAATCAGGGGACGGTGGATTTTGTGCCGAACTACGACGGCGCGTTTGACGAACC 
GCTGCACCTGCCCGCCCGCCTGCCTATGGTGTTGCTCAACGGCGCGTCAGGCATTGCGGT 
GGGCATGGCGACCGAGATTCCGCCGCACAATTTGAACGAAGTGACGCAGGCGGCGATTGC 

GTTGTTGAAAAAGCCGACGCTGGAAACCGCCGACCTGATGCAATATATTCCTGCCCCCGA
TTTTGCCGGCGGCGGTCAAATCATCACGCCGGCGGACGAATTGCGCCGGATTTATGAAAC 
CGGCAAGGGCAGCGTGCGCGTGCGTGCGCGTTATGAAATCGAAAAATTGGCGCGCGGACA 
GTGGCGCGTCATCGTAACCGAGCTGCCGCCGAACGCCAATTCCGCCAAAATCCTTGCCGA 
AATCGAAGAGCAAACCAACCCGAAACCGAAAGCGGGTAAGAAACAGCTCAACCAAGACCA 

GCTCAATACCAAAAAGCTGATGCTGGATTTAATCGACCGCGTGCGCGACGAGTCCGACGG
CGAACATCCCGTGCGACTGGTATTCGAGCCGAAATCCAGCCGCATCGATACCGATACCTT 
CATCAACACGCTGATGGCGCAAACTTCGCTGGAAGGCAATGTGTCGATGAACTTGGTGAT 
GATGGGTTTGGACAACCGCCCCGCGCAGAAAAACCTGAAAACGATTTTGCAGGAATGGCT 
GGATTTCCGCACCGTAACCGTAACACGCCGTCTGAAATTCCGTTTGAACCAAGTGGAAAA 

ACGGCTGCACATCCTCGAAGGCCGTCTGAAAGTCTTTCTGCACATCGACGAAGTGATTAA
AGTCATCCGCGAATCAGACGACCCGAAAGCCGATTTGATGGCGGCGTTCGGGCTGACCGA 
AATCCAAGCCGAAGACATTTTGGAAATCCGCCTGCGCCAGTTGGCGCGTTTGGAGGGTTT 
CAAACTCGAAAAAGAATTGAACGAGTTGCGCGAGGAACAAGGCCGTCTGAACATCCTTTT 
GAGCGACGAAAACGAAAAACGCAAGCTGATTGTCAAAGAGATGCAGGCGGATATGAAACA 

ATACGGCGACGCGCGACGCACGCTGGTGGAAGAGGCCGGACGCGCCGTGCTGACGCAGAC
CACCGCCGACGAACCCATCACGCTGATCCTGTCGGAAAAAGGCTGGATACGCAGCCGCGC 
CGGACACAATCTCGATTTGAGCCAAACCGCGTTCAAAGAAGGCGACTGCCTCAAACAAAC 
CCTCGAAGGCAGAACGGTTTTACCCGTCGTCATCCTCGATTCATCGGGCAGAACCTACAC 
GCTCGATGCCGCCGAAATCCCCGGAGGGCGCGGCGACGGCGTACCGGTTTCCTCCTTAAT 

CGAGCTGCAAAACGGCGCGAAACCCGTTGCGATGTTGACAGGATTGCCGGAACAACATTA
TTTATTATCAAGCAGCAGCGGCTATGGCTTCATCACCAAGCTGGGCGATATGGTCGGGCG 
CGTGAAAGCGGGCAAAGTGGTGATGACCGCAGACAGCGGCGAAACCGTTTTGCCGCCGGT 
TGCCGTCTATGCCTCCTCGTTCATCAACCCCGACTGCAAAATCATTGCCGCCACCAGTCA 
AAACCGCGCCCTCGCCTTCCCCATCGGCGAATTGAAAATTATGGCGAAAGGCAAAGGGCT 

GCAAATCATCGGATTAAACGCCGGCGAATCGATGACGCATACCGCCGTTTCTTCCGAGCT
GGAAATCCTGATTGAAAGCGAAGGCAGGCGCGGCGCGGCGCACAAAGACCGCATCCCCAT 
CTCCCTGCTTGAGGCAAAACGCGGCAAAAAAGGCAGACTATTGCCCATATCGGGCAGCCT 
GAAACAGCTTTCTTCCCCTAAATAAACCCGGTTCCGCACATATTATGGTGATTTCCAACC 
CCCGCGAACTTGAAAAACTCAAAGACCGGATTCCCAATCTGATCAACATCATCCGCGTCG 

CCATCGTTTTTCCGCTGATGATTATGCACATCCTCGGGCTGGAAACCGGCAGCCGTGCGA
ACCTGCACGCTTCGTGGACGGCGTGGGCGTTTTATGTTTGGCTCGCCATTGCCTGCTGGC 
TGATTTTCTTTTCCATTATCCATCCGCATTGGCAATGGCAGTCGCTGAAAATGCCGCGTT 
TCAGCGCGGTAGCGGACATCACGATGflTCGGCGTGCTGACCTACCTGrTCGGCGGCATCG
ATTCCGGCTTCGGCATCCTGATCCTGCCCTTCGTCGTCTGCTCCTGCCTGCTCAGCTACG 
GGCGCTACCCCCTGCTCTATTCCAGCTACGCCGCCATCCTGCTGATATTCAACGCCATTG 
CCGACGGCGATATCGGCAAATACCCGCTCATATCGGATGCCCGAACC3CCTCGGCAACCT 
TCATCCTTGTCGCCGCCTCCTATCTTTCCGCCATCTTCACCTCACTGTCGGTCAAATACA
TCGACCGTGCCGGAAAACTCGCCTACGACAGCCATATCGCCTACCACCGCATCAAAGGCT 
TGAGCCAAACCGTACTCGAACGCGTTCAGGAAGCTGTCGTCGTCATCAATGCCGAAGGGC 
TGGCGGTGCTGTTCAACCGGAAGGCGAAAGACCTTTTCCCCGCGCTCGAAATCGGACGGC 
GCGCCGGTCTGTCCGATTCTGCCGCCGAACTGTGGGATCAAGCCTCTCCGCACACTTTCG 
AATACGTCCTCGGCACACCCGGCCTGAACGCCGGCATCCGCGCCGTTCCGGTCAACAAAG
GGTCGGACAAGCTGCTCATCCTCTACATCCGCCCGCAAAGCGAAATTCAGGCAGAAGCCC 
TGTCCGTCAAACTTGCCGCGCTCGGACAACTGACCGCCAACCTCGCCCACGAAATCCGCA 
ACCCGATGTCCGCCATCCGCCACGCCAACGACCTGCTGCGCGAAAATATGGAAGCGGGGG 
CGGCAGATCCGTTCAACGCCAAATTGTGCAAAATCATCGACGGCAACATCTGCCGCATCG 
ACAAAATGCTCGAAGACATTTCCTCGCTCAACAAGCGCAACAAAACCGAACGCGAAACCA
TCGGCCTGATACCGTTTTGGGAAGAATTCAAACAAGAGTTCCTGCTCGGCCATCCCGATG 
CCGCCGACTGCATCCGTCCGGACATTCAAGGCGGCAGCCCGACCGCCTATTTCGATCCCG 
CCCACCTGCGGCAAATTATGTGGAACCTCGCCAACAACGCGTGGCGGCACAGCCGCAAAC 
AGCCCGGCTCGATTTCCGTCACCATCCGCCCCGCGCAAAAAAACACCGTCTGTATCCTCT 
TTGCCGACCGCCCGAAGTGCAGGAACACCTGTTCGAACCCTTTTACACCACGGCGGAAAA
CGGCACCGGCCTCGGGCTGTATGTCGCCCGCGAACTGGCGCACGCCAATTTCGGCGATTT 
GACCTACCTACCGGAAGCCAAATGTTTCGAACTCACATTACCGGAAAAAACCAATGACTG 
AACTGCAACACCCCGTCCTCGTCGTCGATGACGAAACCGACATTCTCGACCTGATGGAAA 
TGACCCTGATGAAAATGGGCTTGCGCGTCCATACCGCGTCAGGCGTTGCCGAAGCCAAAA 
ACAAGCTCGACAGCCAACGCTATTCGCTCGTCCTGACCGATATGCG7ATGCCGGACGGCT
CGGGGCTGGAAGTCGTCCAACACATCAACAGCCGCCTGCTCGATACGCCGGTTGCCGTCA 
TCACCGCCTTCGGCAACGCCGATCAGGCACAGGAAGCGTTGCGTTGCGGCGCGTTCGACC 
CCGATACCATGCAGATACAGGACTATCTCGACCAAATCGAACGCGACATCATCGAACAAA 
CCCTCAAACAAACCGAAGGCAACCGCACGCAGGCCGCCAAACGCTTGGGCATCAGCTTCC 
GTTCCATGCGCTACCGTATGGAACGCCTCAACATCGGCTGACGACAAAACGGCATCCGCA
CCATCTCCGCCCACCCGAAAAAATGCCGTCTGAAACGGCACGGGAAAGCGGGTTCGCCCC 
ACGCCCGAACGGACACAAAACACCATGACCGACATCCTTATTGACAACACCGCCACCGAA 
ACCGTCCGCACCCTGATACGGGCATTCCCCCTTGTGCCCGTTTCCCAACCGCCCGAACAA 
GGCAGTTACCTCCTTGCCGAACACGATACCGTCAGCCTCAGGCTTGTCGGGGAAAAAAGC 
AGCGTCATCGTCGATTTTGCCTCCGGCGCGGCACAATACCGGCGCACAAAAGGCGGGGGC
GAACTCATCGCCAAAGCCGTCAACCACACCGCGCACCCCACCGTTTGGGACGCAACCGCA 
GGATTGGGGCGCGACAGCTTCGTCCTCGCCTCGCTCGGGCTGGCCG7TACCGCCTTCGAG 
CAACATCCCGCCGTCGCCTGCCTGCTTTCAGACGGCATCCGCCGCGCCCTCCTCAATCCC 
GAAACGCAAAACACCGCCGCGCACATCAACCTCCATTTCGGCAACGCCGCCGAACAAATG 
CCCGCACTTGTCCAAACACAAGGCAAACCCGACATCGTCTATCTCGACCCCATGTATCCC
GAACGCCGCAAAAGTGCCGCCGTTAAAAAAGAAATGACCTACTTCCACCGGCTCGTCGGC 
GAAGCGCAAGATGAAGCGGCACTCCTGCATACCGCACGCCAAACAGCAAAAAAACGCGTC 
GTCGTCAAACGCCCCCGCCTCGGCGAACACCTTGCCGGACAAGACCCTGCCTACCAATAC 
ACAGGCAAAAGCACCCGCTTCGACGTTTACCTGCCCTACGGGACGGACAAGGGATAACGC 
CCATAAAACAAGACACCGAAAAATTTGCCGTTCTTATGCA.AACGAGA_AACCGGTTTTTGC
GTTTCGACTGTTTTGGATAAGTCATCACACCTTAAAGTTTGTCATTCGCACAGAAGTGGG 
AATCCGATTCATTCAGTTTTATAGTGGTTTAAATTTAAACCACTATAGTTGTTTTCGAGT 
TTCAGGCAACTTCCAAACCGTCATTCCCACGGAAGTGGGAATCTAGAAATGAAAGGCAAC 
AGGAATTTATCGTAAATGACTGAAACCGAACGGACTAGATTCCCGCCTACGCGGGAATGA 
CGGGGCGGGCAGATGCCGTCTGAAATTCCGTCATTCCCGTGAAAACGGGAATCTAGAACT
TCTGATTTTTCAGACGACTTTTGAACATTGCCGCCACCCAATGATCTGGATTCCCACCTG 
CGCGGGAATGACGAGGTTTCAGGTTGCTGTTTTTAAGTTGCTGTTTCGGGTTGCTGTTTT 
TTATGGAAATGACAAGGTTTTAGATTGCGAGAATTTATCCGCTCCTCCGTCATTCCCACG 
GAAGTGGGAATCCAGAAATGAAAAGCAACAGGAATTTATCATAAATGACCGAAACCGAAC 
GGACTAGATTTCCGACTGCGCGGGAATGACGGGGCGGGAGGATGCCGTCTGAAATTCCGT
CATTCCCGTGAAAACGGGAATCTAGAACTTCTGATTTTTCAGACGACTTTTGAACATTGC 
CGCTACCCAATGATTTGGATTCCCGCCTGCGCGGGAATGACGATGTAAAATTATCCGGGA 






TTCAAAAAGACAGGCTTTCACATCCGTGGGAATGACTGCGGAAAGATGATTTTTATAGTG 
GATTAACAAAAATCAGGACAAGGCGACGAAGCCGCAGACAGTACAAATAGTACGGCAAGG 
CGAGGCAACGCCGTACTGGTTTTTGTTAATCCACTATATTTTGTCATAAAAATCCGCACC 
TTAATCAGTTGGCGGTTAAATCAAACTTTTAGGGTGCAGATTACTTTTTATGATTTCAGA 
CAGCATTTTGACAGGCGGCAGCCTATTTCGGCAATACCAAAAACTTAATCAGCAGTTCTT
TGAATACAAAACCGAACACGCCCAAGCCCAAAACCAAAAACAAAATGGCGATGCCGAATT 
TGCCTGCTTTGGACTCCTTGCCCAAATTCCAAACGATAAAACCCAAAAAAATAATCAAGC 
CGGTCAGGCAGATTTTCAACGCCCAATCGGCAAAAACCGCTTCATCCATATTTTTTTCCT 
ATTGTTGATGTGTATGCCATATAAGATAAGGGTTTCAGACGGCATCTGCTGTCCAATGCC 
GTCTGAAACACGCAATCAGCGTGCGAGTGCCTGTTTCAAATCGTCAATCAAATCGCCAAC
ATATTCCAAACCGACCGACAGGCGCACCAATCCGGGGCGGATGTTGGCGGCGAGTTTTTC 
TTCGGGCTGCATCCTGCCGTGCGTGGTTGTCCACGGGTGGGTAATGGTCGAGCGCACGTC 
ACCGAGGTTGGCGGTGCGGGAAAAGAGTTCCACGCCGTCCACAACTTTCCACGCCGCTTC 
TTGATCGGCAACTTCAAAGCCGATGACGATGCCGCCGCCGTTTTGCTGTTTGCGGATAAG 
CGCCGCCTGAGGATGGTCGGACAATCCGGTGTAGTACACGGCTTGAACCTGCGGCTGCGC
TTGCAGCCATTGTGCGATTTTCAGGGCGTTGTCGAACTGTTTTTCCATACGCAGCGACAG 
GGTTTCCACGCCGCTCAACAACTGCCACGCATTAAACGGCGACATCGCCAGCCCGCAAGA 
GTTGCAATACATGGCGACCTGCGCCAACAACTCTTCCGAACCCGCCAACACGCCGCCCAT 
CACACGCCCGTGTCCGTCTATGGCTTTGGTCGCGGAGGAAACGGAAATATCCGCACCGTG 

TTTCAAAGGCTGCGAGCCGACGGGCGACAGCAGGCTGTTGTCCACCACCAAGAGCGCGCC
GATGCCGTGCGCCAATTCCGCCAAGGCTTCCAAGTCGGCCACTTCGCCTAAGGGGTTGGA 
CGGCGTTTCCAAAAACAGCAGTTTGGTATTGGCTTTGACGGCGGCTTTCCATTCGTTTAT 
ATCAGTCGGCGACACGTGGCTCACTTCGATGCCGAATTTGGCAACGATGTTATTGATAAA 
GCCGACGGTCGTGCCGAACAGGCTGCGGCTGGAAATCACATGGTCGCCCGCCTGCAAAAA 

GGTGAAAAACGCCGCCTGAATCGCAGACATACCCGCCGAAGTGGCGACCGCGCGTTCCGC
ACCTTCCAAAGCGGCGATGCGTTTTTCAAAGGCGGCTGTGGTCGGGTTGGCGGTACGGGT 
ATAAGTGT^CCCTTTGATTTTTTTTGAAAACAAATCGGCAGCGTGTTGGGCGTTGTCCCA 
CATGAAGCTGCTGGTCAGAAACAATGCCTGATTGTGTTCGCGGTATTCGGTTTGTTCTTT 
GCCGCCGCGTATGGCGAGCGTTTGCGGATGGAGTTTTTTGCTCATCGGTGATTCCTCGGT 

TTTGTCCGTTCGGCAACGGAGCGTGCGCCCGTTGTTTAATTTGTTAATATTTTGCGCCTG
TTCTATGATGCTTTCAAGTCGGATGAGAATGCAAATGCCGTCTGTiAACGGCTTTCAGACG 
GCATGGCAATCAGCGTTTGTATTTTAACTCGTACTTGATGTCGTTGAGGATTTTGCGGAC 
ATCGTGTTCCAACACGTCTTCGACTACCGCCCCCGCCTGCTCGTGCAGCATCTGCTGGAG 
CTGATAGGTGAAAACCGCCATCTGCTTTTGCACCGCCGTTCGGATGATGCCGTTGACGGT 

ATCGGTCAGATGCGGGCGCAGGCGTTTGATCAGCCGTTCGGTCAGCTCCTGTTCGGACAG
GCAGAACACTTCGCGCCGGTTGACGGCTTTCGGGTTCAGGATATTGATTTGGACGGGCAT 
CAACGTTTCTTCCGCATCGTTTTCCCCGTTTTCCGAAACCGCCGGCTCATTCGTGCCGGA 
TTCTGCCTCGTCGGCGTTTTCCCCGCTTTCAATCTGTCCGGTTTCAAATTCGACACTGTC 
TTTTTTGGTATCAAACCGGATTCTCCGCCGCGATTCGATGTGTTTTTCCGAAACCGACAT 

TTGCAGGGAAGCCTGCGCGTTGAGCCAGTTTTCCTGAAGGACGATCATCGGGTCGGTTTC
GACTTCCTCGCCGCAATCGGCAACGGCGGCATTGTGTTCCTCCTGCCATTTTTTCAGATA 
CGCCTTCAACACACGGGCTCGGCTCTCATCGTCCAGTTTCGGCACAGGCGCGTCCGTTCC 
GGTTTCAGAGGGGCGGGACAGCGGCGCGTAAGTCGGCACTGCCTTCATACGGCGCGTCTG 
ACGCAGGTTTTCCAAACGTTTTTCCCAATTCGGCTCTTTATTCGCATCCATTTTCGGCTT 

CCGGTTCTTAATCTTTGCAAGCAGACAAACCCGCGCCCAAAGCGCGGTTTGATATAATGG
CGCATTTTAACAGATTCGCGAGGATACATCATGGGCAGCATCGAACAGCGTTTGGAATAT 
CTGGAAGAGGCGAACGACGTGCTGCGTATGCAGAACCACGTCCTGTCCACCGCATTCAAA 
GCCTTAATCCGCGCCCTTCCCGCCGAAACCGCCGAAATCGCGGTCGAGTCGATTCAGCTT 
GCTTTTGAGGACGCCTTGGCAGAATTGAGCTATGAGGACAGCCCGCATACGGATTTGTTC 

CACGACGTTACTTATGCGTTTTTCCGTGAAAAAGAACGTTAATTTTATGTTAAACTGATT
TTTTAGGCTTTTTGATTACCGATU^GGAATTTTGATGTVATATGAAAAAATGGATTGCCGCC 
GCCCTTGCCTGTTCCGCGCTCGCGCTGTCTGCCTGCGGCGGTCAGGGCAAAGATACCGCC 
GCGCCTGCCGCCAACCCCGACAAAGTGTACCGCGTGGCTTCCAACGCCGAGTTTGCCCCC 
TTTGAATCTTTAGACTCGAAAGGCAATGTCGAAGGTTTCGATGTGGATTTGATGAACGCG 

ATGGCGAAGGCGGGCAATTTTAAAATCGAATTCAAACACCAGCCGTGGGACAGCCTTTTC
CCCGCCTTAAACAACGGCGATGCGGACGTTGTGATGTCGGGCGTAACCATTACCGACGAC 
CGCAAACAGTCTATGGACTTCAGCGACCCGTATTTTGAAATCACCCAAGTCGTCCTCGTT 
CCGAAAGGCAAAAAAGTATCTTCTTCCGAAGATTTGAAAAACATGAACAAAGTCGGCGTG
GTAACCGGCTACACGGGCGATTTCTCCGTATCCAAACTCTTGGGCAACGACAATCCGAAA 
ATCGCGCGCTTTGAAAACGTTCCCCTGATTATCAAAGAACTGGAAAACGGCGGCTTGGAT 
TCCGTGGTCAGCGACAGCGCGGTCATCGCCAATTATGTGAAAAACAATCCGGCCAAAGGG 
ATGGACTTCGTTACCCTGCCCGACTTCACCACCGAACACTACGGCATCGCGGTACGCAAA
GGCGACGAAGCAACCGTCAAAATGCTG?ACGATGCGTTGGAAAAAGTACGCGAAAGCGGC 
GAATACGACAAGATTTACGCCA.JVATATTTTGCAAAAGAAGACGGACAGGCCGCAAAATAA 
GCCCGCCCGTCCGAACACAATGCCGTCTGAAGCCCTTTCAGACGGCATTGTTCATCAATC 
GGCCTACAATGAACTGCCTGCTGATTTCTCCCTACCGCAAAGCAACAGGCAAAGATTACA 
AATATCAAAATCCGAGTAAAACAGTATTTTATTAAAACAAATTGATAATCAAGAGATTAG
AATTATGTATTGTCTTTACCGTACAAACGCTGGCACTATTTCAACCTGATAAAAAACAGC 
CTTCAAAAAGGTTGTTTAAAACAGCAGCAGACACTTACCGCCACAACCTTGAAAAGGAAC 
ACAATCATGACCGTCATCAAACAGGAAGACTTTATCCAAAGCATTTGCGATGCCTTCCAA 
TTCATCAGCTACTATCATCCCAAAGACTACATCGACGCGCTTTATAAGGCGTGGCAGAAG 
GAAGAAAATCCTGCCGCCAAAGACGCGATGACGCAGATTTTGGTCAACAGCCGTATGTGT
GCGGAAAACAACCGCCCCATCTGCC.?\AGACACAGGTATCGCAACCGTCTTCCTCAAAGTC 
GGTATGAACGTCCAATGGGATGCGGACATGAGCGTGGAAGAGATGGTTAACGAAGGCGTA 
CGCCGCGCCTACACTTGGGAAGGCAATACGCTGCGCGCTTCCGTCCTCGCCGATCCGGCC 
GGCAAACGCCAAAACACCAAAGACAACACCCCCGCCGTCATCCATATGAGCATCGTGCCG 
GGCGGTAAAGTCGAAGTAACCTGCGCGGCAAAAGGCGGCGGCTCTGAAAACAAATCCAAA
CTCGCCATGCTCAATCCTTCCGACAACATCGTCGATTGGGTATTGAAfliACCATCCCGACC 
ATGGGCGCGGGCTGGTGTCCTCCCGGCATCTTGGGTATCGGCATCGGCGGCACGCCCGAA 
AAAGCCGTGCTGATGGCAAAAGAGTCCCTGATGAGCCACATCGACATTCAAGAATTGCAG 
GAAAAGGCCGCGTCCGGCGCGGAATTGTCCACCACCGAAGCCCTGCGCCTCGAACTCTTT 
GAAAAAGTCAACGCGCTGGGCATCGGCGCACAAGGCTTGGGCGGACTGACCACCGTGTTG
GACGTGAAAATCCTCGATTATCCGACCCACGCCGCCTCCAAACCGATTGCCATGATTCCG 
AACTGCGCCGCCACCCGCCACGTCGAATTTGAATTGGACGGCTCAGGCCCTGTCGAACTC 
ACGCCGCCGCGCGTCGAAGACTGGCCCGATTTGACTTACAGCCCCGACAACGGCAAACGC 
GTCGATGTCGACAAGCTGACCAAAGAAGAAGTGGCAAGCTGGAAAACCGGCGACGTATTG 
CTGTTGAACGGCAAAATCCTCACCGGCCGCGATGCCGCACACAAACGCCTCGTCGATATG
CTCAACAAAGGCGAAGAATTGCCCGTCGATTTCACCAACCGCCTGATTTACTACGTCGGC 
CCCGTCGATCCGGTCGGCGATGAAGTCGTCGGTCCGGCAGGTCCGACCACAGCCACCCGC 
ATGGACAAATTCACCCGCCAAATGCTCGAACAAACCGACCTCTTGGGCATGATCGGCAAA 
TCCGAGCGCGGCGTGGCCACCTGCGAAGCCATCGCCGACAACAAAGCCGTGTACCTCATG 
GCAGTCGGCGGCGCGGCGTATCTCGTGGCAAAAGCCATCAAATCTTCCAAAGTCTTGGCG
TTCCCCGAATTGGGCATGGAAGCCATTTACGAATTTGAAGTCAAAGACATGCCCGTAACC 
GTCGCCGTAGATAGCAAAGGCGAATCCATCCACGCCACCGCCCCGCGCAAATGGCAGGCG 
AAAATCGGCATCATCCCCGTCGAATCTTGAGGCGCCATGCCGTCTGAACACAAAATCTGC 
CTTCAGACGGCATTTCCGCCCCCGGTTGCGGTACAATCCACCATTTCATCACTCGGCGAC 
CCACACCGTGAAAATCCTCATTTTAGGCAACGGACAGGTAGGTTCTACCGTCGCACAAAA
CCTTGCCGCCATACCCAACAACGACGTAACCGTTATCGACATCGACGAAAAAGCATTGCA 
GGAAACAGGCAGCCGCCTCGATGTCCAAACCGTTTTCGGCAACGGCGCATCCCCCTTCAC 
ATTAGAACGCGCCGGCGCGGAAGATGCCGACTTGCTGCTCGCGCTCTCCCGCAGCGACGA 
AACCAACATCGTCGCCTGCAAAGTTGCCGCCGACCTGTTCAACATCCCCGGCCGCATCGC 
GCGCGTCCGTTCCAGCGAATACCTCGAATACCTCAGCCCCAAGCTCGAAAACAACGAAAA
CGGCAGCCTTTCCATATTCGGCATAACCGAAACCATCAGCCCCGAACAGCTCGTTACCGA 
ACAGCTTGCCGGCCTGATAGACTGCCCGGGCGCATTGCAGGTTTTACGTTTTGCAGACGA 
CCGCGTGCGGATGGTCATCATACAGGCGCGGCGCGGCGGACTGCTTGTCGGACGCAGCAT 
TGCCGACATCGCCCAAGATTTGCCCGACGGGGCCGACTGCCAAATCTGCGCCGTTTACCG 
CAACAACCGCCTCATCGTCCCCGCGCCGCAAACCGTCATCATCGAAGGCGACGAAATCCT
ATTTGCCGCCGCCGCCGAAAACATCGGCGCGGTCATACCCGAATTGCGCCCCAAAGAAAC 
CAGCACCCGCCGCATCATGATTGCCGGCGGCGGCAACATCGGCTACCGTCTCGCCAAGCA 
GCTCGAACACGCATACAACGTCAAAATCATCGAATGCCGGCCGCGCCGTGCCGAATGGAT 
AGCCGAAAACCTCGACAACACCCTCGTCCTGCAAGGTTCGGCAACCGACGAAACCCTGCT 
CGACAACGAATACATCGACGAAATCGACGTATTCTGCGCCCTGACCAACGACGACGAAAG
CAACATTATGTCCGCCCTTTTGGCGAAAAACCTCGGCGCGAAGCGCGTCATCGGCATCGT 
CTiACCGCTCAAGCTACGTCGATTTGCTCGAAGGCAACAAAATCGACATCGTCGTCTCCCC 
CCACCTCATCACCATCGGCTCGATACTCGCCCACATCCGGCGCGGCGACATCGTTGCCGT
CCACCCCATCCGGCGCGGCACGGCGGAAGCCATCGAAGTCGTCGCACACGGCGACAAAAA 
AACTTCCGCCATCATCGGCAGGCGCATCAGCGGCATCAAATGGCCCGAAGGCTGCCACAT 
TGCCGCCGTCGTCCGCGCCGGAACCGGCGAAACCATTATGGGACACCATACCGAAACCGT 
CATCCAAGACGGCGACCACATCATCTTTTTCGTCTCGCGCCGGCGCATCCTGAACGAACT
GGAAAAACTCATCCAGGTCAAAATGGGCTTTTTCGGATAAACCGCCCCATTCCGGACATA 
TTGCCGCCAAGCGGTATGGAAGCGGAAATAATGGTAGGTGGGCTTCAGACGGCATCCGCC 
CTCCCCGTCATTCCCGCGTAAGCGGGCATCCAGACCTTGGGATAGCGGCAATATTCAAAG 
GTTATAAAAGACCCGTCATTCCCGCGCAGGCGGGAATCCAGACCTTGGGATAGCGGCAAT 
ATTCAAAGGTTATCTGAAAATTTAGAGGTTCTAGATTCCCGCTTTCGCGGGAATGACGAA
AAGTTGCGGGAATCCAGAACGTCGGGCAACGGCAATATTCAAAAGCCGTCTGAAAATTTA 
AAAGTTCTAGATTCCCGCTTTCGCGGGAATGACGAAGTTTCAGACGGCATCGCCCGCCTG 
TTTTGATATAGCGGCACCCCCCCGACAAAAAAACAATCCGGAACGCATCTGACCGTTCCG 
GCTTGTTTTCAGGCGAATCCGCCGCATCAGAACATACTGCGCACGCCCATATTGACCTGC 
CAAGTCTAGCGCATCGTGTGCATCGAAGACCTTTGCGCCTCAAAATAAAGCTGCCTTCCG
TTGTCGGCATTACCACGCAAAAAAATGAATTGCTTGATATTCCAATGTTTTTTATATGTT 
TTTATATTGTGATGCGATCAGACAAACGCCCCCCTGACATTTGTTTAGACGGCATCGTAT 
TGCTTWiTTTCTATAAGTATGTATAATGTCCGTTTCCACGCGCCCATCGTCTAGAGGCCT 
AGGACACTGCCCTTTCACGGCGGCAACCGGGGTTCGAATCCCCGTGGGCGTGCCAATTCA 
AAAACCTGCTTGTTTCAAGCAGGTTTTTTATTATGAGTCGTCATTCCCGCAATTTTTCGT
CATTCCCGCAAAAGCGGGAATCTAGAGCGTAGGGTTGAAGAAACCGTTTTATCCGATAAG 
TTTCCGTGCCGACAGGTCTGGATTmCCGCCTGCGCGGGAAGGACGGCAGAGGGTGGACGA 
TGCCGTCTGAAGCCTGACAAAGCATTTGATGCCGTCTGAAACTTCGTCATTCCCGCAAAA 
GCGGGAATCTAGAGCGTAGGGTTGAAGAAACCGTTTTATCCGATAAGTTTCCGTGCCGAC 
AGGTCTGGATTCCCGCTTTCGTAGGAATGACGGAATTTTAGGTTTCTGTTTTTGTGGAAA
TGACGAATAAAGCGTGCCGGTTTATGCTCGCCGCAACACGCGGTTCAGACGGCATTGCTC 
TCTTTTTTCATTATCAGTGGGTGTAGCAACTGTATTTTTCACCCCGTCGGGCAAATVATAC 
AGTTGCTACGATGCACCCCGCCGCCCTGCCCTGTGCCTTGTCCTGCAATACGGCATATAA 
TGCACCACAAACCCCCGCGCTGCGGTTTTCAGACGGCATCGCCGTGCTTTTTTACAGGCA 
TTAGCCCTTTTTATCGGACGCAATATTAAGGAGGAACAAATGAAAAGCTCTTTTGTGCAA
ACGCTTACCATCGCCGGTTCGGATTCGGGCGGCGGTGCGGGCATTCAGGCGGATTTGAAA 
ACATTTCAGATGCGCGGCGTGTTCGGAACGTGCGTCATCACCGCCGTTACCGCGCAAAAT 
ACCTTGGGCGTGTCGGCGGTTCATCTCGTCCCGACCGAAACCATCACCGCACAAATCCAA 
GCAATCAGGGAAGACTTCGACATCCGCGCCTACAAAATCGGTATGCTCGGCACGGCGGAA 
ATCATCGAATGCGTTGCCGACAAGCTGAAACACTGCAGCTTTGGCAGGCGCGTACTCGAC
CCTGTGATGATTGCCAAAGGCGGTGCGCCGCTGTTGCAGGATTCCGCCGTTGCGGCACTG 
ACGCGCCTGCTGCTTCCCGATACGGATGTATTGACCCCCAACCTGCCCGAAGCGGAAGCT 
CTGACCGGCGTGCATATTGAAAACCGTAAAGATGCGGAACGTGCGGCAAAAATCCTGCTT 
GATTACGGTGTCAAAAATGTCGTTATCAAAGGCGGACATTTGAACGGCAGCACAAGCGGA 
CGCTGCACGGATTGGCTGTTTACACAAAATGAAACGCTGGAATTCGACAGCCCGCGCTTT
CCGACCGCCCACACGCACGGCACGGGCTGCACGTTTTCCGCCTGCATCACCGCCGAGTTG 
GCAAAAGGCTCGGACGTTTGCGAAGCCGTACAGACTGCCAAGGCCTACATCACGGCGGCA 
ATCTCAAACCCTTTGGAAATCGGCGCAGGACACGGCCCGGTCAATCATTGGGCGTATCGG 
GACTAACCGTAAAAATGCCGTCTGAAACAAAATGTTCAGACGGCATTTTTGAGGATTATT 
CAGGCTTTTTCGCCAGCATCGTTACAAATTTAAACCGTATCGGATTGCCGTTTTCGTCTT
TGGCATGCATAGAACCCAATTCTTCTTTATATTCGACCAGTTCCCAATCCCGATAATAAT 
CCTTCAGCTCGCCCTCTTTAAATTTAAAAGGGAACGGCATCGGACAGGGGAAATCCGCCG 
TATCCATTGCCGATACAATCAAGTTGTACCCGCCCGCCGCCGTATGCGCCTGCATATCGG 
CAATCACGTCGGGTACGCGCTGCGGCATCAGGAACATCAGCACCACTGTTGCCACAATAT 
AATCAAACTCGCCCTGCAAGGCGGCGGCGTTCAAATCATATTCCAGCGTGCGGACGTTCA
AACCCTCCGCCTCTGCCAGCTCCGCCACGTTTGCCAAGGCGGCGGGATTGTGATCGACTG 
CAGTAACTTCAAACCCCTTCAAACCGAGAAACAGCGCGTTGCGCCCCTGTCCGCAGCCCA 
TATCCAACGCCCTGCCCGCCGGTACGGTATCCCGTGCCGCCGCGACCGCAGAATGCGTGG 
CACTCATCCCGTATTTTTTGTGAAAATAGTCTGCCGCCGCGCAATACAGCGACAAACGGA 
TTTCGGCATCGTCCGTTTTCGGTTTGACAGAAAACACCTGCTGCGGCGCAAACACACAAT
CGCCGCCGTCTGCCGACCAAACTTCTGCCGACCCGTCCGGTGCACGAACTTCGACATCGC 
CCTGCAACACATTCAGGCAGACCCACTCCCCTTCCTCAGACGAATAGCCCGACAACAAAA 
CTTCCGGCAGGTTTTCCACTTTCCATACAGGCATCTGTCCGAAACAAAACAACTCGCCAC
TTTGACCCACTATCCGCTCCTTCAwATTCAAAAATAAAGTTGCACATTATATGCCTATTT 
TAATCCGCCGCAATCTTTCAGACGGCACGGCGCGCAAACCGCTTATAATCACGCCGGACA 
CCACACAAAGGCACAATAATGAACCAAACCGTTTACCTTTACACCGACGGCGCGTGCAAA 
GGCAATCCCGGCGCGGGCGGCTGGGGCGTGTTAATGCGCTACGGTAGCCACGAAAAAGAA
CTTTTCGGCGGCGAAGCGCAAACCACCAACAACCGCATGGAACTGACTGCCGTCATCGAA 
GGACTGAAATCGCTCAAACGCCGCTGCACCGTCATCATCTGCACCGACTCGCAATACGTC 
AAAAATGGCATGGAAAACTGGATACACGGTTGGAAGCGCAACGGCTGGAAAACCGCCTCC 
AAACAGCCCGTCAAAAACGACGACTTGTGGAAAGAACTCGACGCTCTAGTCGGACGGCAT 
CAAGTCAGTTGGACTTGGGTGAAAGGACACGCGGGACACGCCGAAAACGAACGCGCCGAC
GATTTGGCAAACCGTGGCGCAGCGCAGTTTTCCTGACTGCCGCTCCGGCAAAAATGCCGT 
CTGAAACCGCTAATGGGCTTCAGACGGCATCGTCCTCCACCGTCATTCCCGCGCAAGCGG 
GAATCCAAACCGTCGGGCAACGGCAATATTCAAAGATTATCTGAAAGTTTGAAGTTCTAG 
ATTCCCGTTTTCACGGGAATGACGAAAAGTTGCAAGAATGACGGAGTTTCAGGCGGCATC 
CGACCGCCCCGTCATTCCCGCGAAAGCGGGAATCTAAAAACCCAACGCTGCAAGATTTAT
CAGAAACAACTGAAACCGAACGGACTGGATTCCCGCCTGCGCGGGAATGACGGGATTTTA 
GTAACCGTAGCAACCGCCTGCGCGACGGCTAAGGGGCTTCAGCAACCGTAGCAACTGCCT 
GTGTGGGAATGACGGACAATGGGCTTCAGACGGCATCTCTTGCCTGCCGCTAAAACAGTT 
TGCCGCACAACTGTTCAAACGCGTCCGATATGTTTCAACACACAGGACGACACATAAAGC 
ACCTCCCTATGTGTCGTCCTGATTTGGAAGGGGTTACACCCCCTCCCAAATAAAGTCTGA
TCCTGCCGCCCTAAAGGGCGGGGTTTCAACCGAAAAGGAAATACGATGAAGTGGTACAAT 
TAGCGGCAATGCGGACAGACAAATTAAACTATAGTGGATTAAATTTAAACCAGTACGGCG 
TTGCCTCGCCTTAGCTCAAAGAGAACGATTCTCTAAGGTGCTGAAGCACCAAGTGAATCG 
GTTCTGTACTATTTGTACTGTCTGCGGCTTCGTTGCCTTGTCCTGATTTTTGTTAATCCG 
CTATATCAGAAATTACCCTACCGTTTTTTAAACACTTTCAGGAATAAGGAAAAATGACCG
CCCAACCCTGCCCCATCTGCACGGCGCAAAATGAAGACGTTTTGCTGCAAACCCCCAACC 
TCCGCGTCATCGCCGTCCATAACGACAGCGGTTCGCCTGCATTCTGCCGCGTCATTTGGC 
GTAAGCATATTGCCGAAATGACCGACCTTTCGGCAGCGGAACGCGGCGAATTGATGGAAA 
TGGTGTACAAAGTCGAAGCCGCTATGCGCCAAGTGTTCCGGCCGGCAAAAATCAACCTCG 
CCAGCTTGGGCAATGTCGTGCCGCACCTGCATTGGCATATTATCGCCCGCTTTGAAAACG
ATGCGTCTTTCCCCGCGCCGATTTGGGCAAACCCCGTCCGGAAACACGGTATGACCCTGC 
CGCAAGATTGGACGGAACAGCTTAAAAAGCTGCTTTAAGCCCGCCGATGCCGTCTGAAAC 
CGTATGAAAGGGAAATTATGACCGAACCGACCTCCCGCCGCCGTTTTCTGAAAACCTGCA 
CCGCCGCTGCCGGCGCGGGGCTGCTTCAGGCTTGCGGCACATCCGCCACATCCGTTCCGC 
CCCTTCCCTCTTCCCATTCCGTTGTGAAAGCCCGAACCGTGCCTCTCCAAACGCCACGCC
GTCAAAGTTCGGACGGCAACCTTCTGCGCGTTGTCGCTTCGTCAGGATTTGCCGAAGACA 
CCAACCGCGTCAACACAGCCTTAACCCGCCTTTACAATGTCGGTTTTACCGTAACCAACC 
AACAGGCGGGCAGCCGCCGTTTCCAACGGTTTGCCGGCACGGACACGCAACGTGCCGCCG 
ATTTCCAAGAGGTCGCTTCCGGCCGCGTCGCCACGCCTAAAGTGCTGATGGGTTTGCGCG 
GCGGTTACGGTGCGGCGCGGATTCTGCCGCATATCGATTTTGCTTCGCTCGGCGCAAGGA
TGCGCGAACACGGCACGCTCTTTTTCGGATTCAGCGACGTATGCGCCGTCCAGCTGGCAT 
TGTTGGCAAAAGGCAATATGATGAGTTTTGCCGGCCCGATGGCTTATAGCGATTTTGGCA 
AACCCGCCCCCGGTGCGTTTACGATGGATGCCTTTATCAAGGGTGCAACCCAAAACCGCC 
TGACCGTTGATGTTCCTTATATCCAACGCGCCGATGTCGAAACCGAAGGCATATTGTGGG 
GCGGCAACTTAAGCGTCCTCGCCTCGCTCGCCGGCACGCCTTATATGCCCGACATCGACG
GCGGCATTTTGTTCCTCGAAGATGTCGGCGAACAGCCCTACCGCATCGAACGTATGCTCA 
ATACGCTGTATCTTTCGGGTATTTTGAAGAAACAGCGCGCCATCGTGTTCGGCAATTTCC 
GTATGGAAAAAATTCGAGATGTCTATGATCCGTCTTATGATTTTTCTGCCGTTGCCAACC 
ATGTTTCGCGCACGGCGAAAATCCCCGTGCTGACGGGCTTCCCGTTCGGACACATTGCCG 
ACAAAATCACTTTCCCTCTAGGCGCGCACGCCCGAATCCGTATGAACGGAAACAGCGGTT
ATTCGGTCGCGTTTGAAGGCTACCCCACACTCGATGCGTCCGCCCTGACTTTGGATACCC 
TGCTCCCACCGCCGGATTTGCCCATCTTCCCCGAAAGCGGTGTTGCCGATATTTCGGAAT 
AAACCCGCAAACGGACAAATGCCGTCTGAAGCCTTCAGACGGCATTTCCCAAGACGGCGG 
CAGATTACAGCAATGCCCGAATATCGGCTTCGATTTCTTCGGGCGTAACACTAGGCGCAA 
AACGCTCGACCACTTCGCCGTCGCGGTTGACGAGGAATTTGGTAAAGTTCCATTTGATGT
CGCCTTCGTCGCGCTTCTCTCCCAAAGCTGCGAGCTTCAACACGAAATCTTTAAACAGAT 
GATTGCCTTTATCTTGCGGTTTGACGGATTTCAGGTAGGCATACAAGGGCGCGGTATTTG 
CTCCATTGACTTCGATTTTGTCGAAAATCTTAAACTTCGTGCCAAACTTCATCATACACA
CTTGGGCAATTTCTCCGCTGCTTTCGGGAGCCTGTTCGCGGAACTGGTTGCACGGAAAAT 
CCAAAATCTCCAAGCCTTCTGCGGTATATTGTGCATACAGCTTCTGCAAAGCCTCGTATT 
GCGGGGTCAGACCGCAACGCGTTGCCGTGTTGACAATCAGCAGAACCTTGCCGCGATAGC 
CTGACAAATCAACCGCATTGCCTTCTGCATCTTTCATTTGAAAATCGTAAATACCCATTT
TTATCCTTATCTGATGTAAACCGATGCCATCTGAAACGTGCTTCAGACGGCATGAAAGCA 
GCAATTGTATAGCCGATTAAAATAAAAAATCCACATCCTTTTCCATTCCCGTCCCAATCC 
GCAATAAAAAACTGCACCCGAAAACGGGTGCAGTTGCTCATTTCATACCGCAAAACTTAT 
TTGTCGCGGCCGAATACGATTTTAGTGGCTTGGATGGCGACACAGATTGCACCGCCGATA 
AAGACCAAGTCAGCTGCCGTACGTACCCAACGCAAGGTATCGAGGATTTCCATTTGCAGG
AACTCTTCGCTGCGGGCATACCACAGACCGTGCGTGATGGAGGCGTATGCCTGAATCGCG 
CCGACAGGCAGCAGGCTGATGGCAATCATACCGGCCAAGCCGCCGTTGAGCAGCCAGAAG 
CCCCAAGTCATCAGTTTGTCGTCAAACTGCGCGTTCGGTTTCAAATAACGGGCAACCAGC 
AATACGAAGCCCAATGCCAAG-AAACCGTACACACCAAACAAGGCGGCGTGCGCGTGAACG 
GCAGAAGTGTTCAAACCTTGGATATAGAACAGGGAAATCGGCGGATTGATCAGGAAGCCG
AATACGCCGGCACCGATCATATTCCAAAAGGCGACTGCCACGAAGCACATCAGCGGCCAA 
CGCAGGCGTTTCGCCCAGTCGGACAGGTGTTGGTAAGACCAGTGTTCGTATGCTTCACGG 
CCCAGCAACACCAGCGGCACGACTTCCAAAGCGGAGAAGCAGGCACCGATTGCCATAGAG 
GCGGAGGTAGAGCCGGAGAAGTACAGGTGGTGCAGCGTGCCCGGAACGCCGCCCAACATA 
AAGATGGCGGCAGCGGCCAAAGTGGAGGCAGTGGCGGTACTGCGGCGGACAAAGCCCATA
TTGTAGAAGACAAAGGCAAAGGCGGCAGTGGCAAATACTTCGAAGAAGCCTTCTACCCAC 
AGGTGAACCACCCACCAACGCCAGTATTCCATAACGGCAATCGGGGATTTTTCGCCATAG 
AACAGGCCTGGTGCGTAGAATACGCCCACACCGACCATAGAAGCTACGAAGATAGCCAAC 
AGGTTTTTGTCCACGCCTT.TTTCTTTAAAGGCGGAAACCGTGCAACGCAACATCAGGAAC 
AGCCATAACAGCAGACCGACCATCAAAAGGAGTTGCCAGAAACGTCCCAAATCGAGGTAT
TCGTAACCTTGGTGTCCGAACCAGAAGTTAAATTCCGGGGGAAGGATGTGCGTCAACGCG 
AAGAAGTTGCCCGCGTAAGAACCGCCGACCACGATGAAGAGGGCGATATAGAGGAAGTTT 
ACGCCGGCACGTTGGAACTTGGGATCTTTACCGCCGTTGACAATCGGCGCGAGGAACAAA 
CCTGCCGTCAAAAAGCCGGTTGCAATCCAGAAGATGGCGGATTGGATGTGCCAAGTACGG 
GTCAGGGCGTAGGGGAACCAGTCGGACATTTCAAAGCCCAACGCCTCGTCAATGCCGTAG
AAACCCTGGCCTTCGACGGTGTAGTGCGCGGTCAGTCCGCCCAGCAATACTTGTACCACA 
AACAGGGCGACCGTCAGGAAGACGTATTTGCCCAATGCTTTTTGCGAAGGGGTCAGTTGG 
ATTTTGGAAATCGGGTCTTCAGACGGCACTTCCACTTCCTCGTGTTTGGTCAGGAAGGAA 
TAACCCCACATCAGCAAACCGATGCCCATCAGCAGAAGAACAACGCTGGTGAATGACCAC 
ATATAGTTTTCAGTGGTCGGTACGTTGTTGATCAAAGGTTCGTGCGGCCAGTTGTTGGTG
TAAGTAAAATTCTCGTCAGGACGGTTGGTCGAAGCAGACCAAGAAGTCCAGA^.GAAGAAG 
TTGAACAGTTTTTCACGCGCTTCTTGGCTTGGCAATGTATTGTTTTTCATTGCAAAGTGT 
TCGCGAGTGGTTTGGAACTTAGGATCGTCGCTGTACACACCGTGGTAGTAAGGCAGGATG 
CTTTCGATGGCTTTCACGCGCGTATCGCTGATGACGACGCTGCCGTCTTCCTTCACGCGG 
CTTTGATTGCGGTATTCGTCGGCCAGGCGTGTTTTCAAGACGGCTTGTTCCTCGGGGGAA
ACCTCGTCGAATTTTTTGCCGTAAGTCTGTTGCGCGGTCAAATCCAACCAGGCAACCAAC 
TCACGATGCAGCCAGTCCGCCGTCCAGTCCGGAGCCTGATATGCACCGTGACCCAAAATC 
GAACCGACTTCCATACCGCCGGTAGTCTGCCATGCAGACTGACCTGCCAAAATATCGTCT 
TTCGTCATCAAGACCTTGCCGGATGCGGAAACGACCTGTTCGGGGTAAGGCGGGGCTTTT 
TTGTAAACCTCGCTGCCCATATAGCCAAGAATGGTAAAGCATAcCGGGTAC



The following partial DNA sequence was identified in N. meningitidis <SEQ ID 2>:
CGAGGCGCAGATACAGGTTTTGGAAGATGTGCACGTCAAGGCGAAGCGCGTACCGAAAGA
CAAAAAAGTGTTTACCGATGCGCGTGCCGTATCGACCCGTCAGGATATATTCAAATCCAG 
CGARAACCTCGACAACATCGTACGCAGCATCCCCGGTGCGTTTACACAGCAAGATAAAAG 
CTCGGGCATTGTGTCTTTGAATATTCGCGGCGACAGCGGGTTCGGGCGGGTCAATACGAT
GGTGGACGGCATCACGCAGACCTTTTATTCGACTTCTACCGATGCGGGCAGGGCAGGCGG 
TTCATCTCAATTCGGTGCATCTGTCGACAGCAATTTTATTGCCGGACTGGATGTCGTCAA 
AGGCAGCTTCAGCGGCTCGGCAGGCATCAACAGCCTTGCCGGTTCGGCGAATCTGCGGAC 
TTTAGGCGTGGATGACGTCGTTCAGGGCAATAATACCTACGGCCTGCTGCTAAAAGGTCT
GACCGGCACCAATTCAACCAAAGGTAATGCGATGGCGGCGATAGGTGCGCGCAAATGGCT 
GGAAAGCGGAGCATCTGTCGGTGTGCTTTACGGGCACAGCAgGCGCAGCGTGGCGCAAAA 
TTACCGCGTGGGCGGCGGCGGGCAGCACATCGGAAATTTTGGCGCGGAATATTTGGAACG 
GCGCAAGCAGCGATATTTTGTACAAGAGGGTGCTTTGAAATTCAATTCCGACAGCGGAAA 
ATGGGAGCGGGATTTACAAAGGCAACAGTGGAAATACAAGCCGTATAAAAATTACAACAA
CCAAGAACTACAAAAATACATCGAAGAGCATGACAAAAGCTGGCGGGAAAACCTGGCAAC 
CGCAATACGACATTACCCCCATCGATCCGTCCAGCCTGAAGCAGCAGTCGGCAGGCAATC 
TGTTTAAATTGGAATACGACGGCGTATTCAATAAATACACGGCGCAATTTCGCGATTTAA 
ACACCAAAATCGGCAGCCGCAAAATCATCAACCGCAATTATCAGTTCAATTACGGTTTGT 
CTTTGAACCCGTATACCAACCTCAATCTGACCGCAGCCTACAATTCGGGCAGGCAGAAAT
ATCCGAAAGGGTCGAAGTTTACAGGCTGGGGGCTTTTAAAGGATTTTGAAACCTACAACA 
ACGCGAAAATCCTCGACCTCAACAACACCGCCACCTTCCGGCTGCCCCGCGAAACCGAGT 
TGCAAACCACTTTGGGCTTCAATTATTTCCACAACGAATACGGCAAAAACCGCTTTCCTG 
AAGAATTGGGGCTGTTTTTCGACGGTCCTGATCAGGACAACGGGCTTTATTCCTATTTGG 
GGCGGTTTAAGGGCGATAAAGGGCTGCTGCCCCJUW^TCAACCATTGTCCAACCGGCCG
GCAGCCAATATTTCAACACGTTCTACTTCGATGCCGCGCTCAAAAAAGACATTTACCGCT 
TAAACTACAGCACCAATACCGTCGGCTACCGTTTCGGCGGCGAATATACGGGCTATTACG 
GCTCGGATGACGAATTTAAGCGGGCATTCGGAGAAAACTCGCCGACATACAAGAAACATT 
GCAACCGGAGCTGCGGGATTTATGAACCCGTATTGAAAAAATACGGCAAAAAGCGCGCCA 
ACAACCATTCGGTCAGCATTAGTGCGGACTTCGGCGATTATTTCATGCCGTTCGCCAGCT
ATTCGCGCACACACCGTATGCCCAACATCCAAGAAATGTATTTTTCCCAAATCGGCGACT 
CCGGCGTTCACACCGCCTTAAAACCAGAGCGCGCAAACACTTGGCAATTTGGCTTCAATA 
CCTATAAAAAAGGATTGTTAAAACAAGATGATACATTAGGATTAAAACTGGTCGGCTACC 
GCAGCCGCATCGACAACTACATCCACAACGTTTACGGGAAATGGTGGGATTTGAACGGGG 
ATATTCCGAGCTGGGTCAGCAGCACCGGGCTTGCCTACACCATCCAACATCGCAATTTCA
AAGACAAAGTGCACAAACACGGTTTTGAGTTGGAGCTGAATTACGATTATGGGCGTTTTT 
TCACCAACCTTTCTTACGCCTATCAAA7VAAGCACGCAACCGACCAACTTCAGCGATGCGA 
GCGAATCGCCCAACAATGCGTCCAAAGAAGACCAACTCAAACAAGGTTATGGGTTGAGCA 
GGGTTTCCGCCCTGCCGCGAGATTACGGACGTTTGGAAGTCGGTACGCGCTGGTTGGGCA 
ACAAACTGACTTTGGGCGGCGCGATGCGCTATTTCGGCAAGAGCATCCGCGCGACGGCTG
AAGAACGCTATATCGACGGCACCAACGGGGGAAATACCAGCAATTTCCGGCAACTGGGCA 
AGCGTTCCATCAAACAAACCGAAACTCTTGCCCGCCAGCCTTTGATTTTTGATTTTTACG 
CCGCTTACGAGCCGAAGAAAAACCTTATTTTCCGCGCCGAAGTCAAAAATCTGTTCGACA 
GGCGTTATATCGATCCGCTCGATGCGGGCAATGATGCGGCAACGCAGCGTTATTACAGCT 
CGTTCGACCCGAAAGACAAGGACGAAGACGTAACGTGTAATGCTGATAAAACGTTGTGCA
ACGGCAAATACGGCGGCACAAGCAAAAGCGTATTGACCAATTTTGCACGCGGACGCACCT 
TTTTGATGACGATGAGCTACAAGTTTTAAAGGCAGCCCGCATTTTGTAGAAAACCGCAAT 
GCCGTCTGAAAGCCCTTCAGACGGCATTTGTTTCCCCAAACGCATCATCCTGCCGCAAGC 
CTATGCCAATCCGTTTTATCGCATCGGCAACTCAAAGAAAAATCCATTTCATTCCCACGC 
AGGGAAGCCGGTTTTTGATTTCGGTTATTTTTGGTTGTTTCGGGTAATTTATGAGTCGTC
ATTCCCGCAAAAGCGGGAATCAGTTTTTTTAAGTTTCAGCCATTTCCGATAAATTCCTGT 
GGCTTTAGCTTTCCGGATTCCCACTTTCGTGAGAATGACGTGGTGCAGGTTTCCGTACGG 
ATGGATTCGTCATTCCCGCGCAGGCGGGAATCTAGACCGTTCGGTTTCGGTTTTTTTGGT 
TAGTGCCGCAACATTAAATTTCTAGATTCCCACTTTCGTGGGAATGACGGCGGAGCGGTT 
TCTGCTTTTTCCAATAAATGCCCCCAACCTAAAATCCGTCATTCCCGCGCAGGCGGGAAT
CTAGACATTCAATGCTAAGGCAATTTATCGGAAATGACTGAAACTCAAAAAACTAGATTC 
CCACTTTCGTGGGAATGACGTGGTGCAGGTTTCCGTATGGATGGATTCGTCATTCCCGCG 
CAGGCGGGAATCTAGTCCGTTCGGTTTCGGTTTTTTTGGCTAATGCCGCAACATTAAATT 
TCTAGATTCCCACTTTCGTGGGAATGACGGCGGAGCGGTTGCTGTTTTTCCCAATAAATG 
CCCCCCAACCTAAAATCCGTCATTCCCGCGCAGGCGGGAATCTAGTCCGTTCGGTTTCGG
TTTTTTTGGCTAGTGCCGCAACATTAAATTTCTAGATTCCCACTTTCGTGGGAATGACGG 
CGGAGCGGTTTCTGCTTTTCCCAATAAATGCCCCCAACCTAAAATCCGTCATTCCCGCGC 
AGGCGGGAATTTAGACATTCAACGCTAAGGCAATTTATCGGAAATGACTGiW^CTCAAAA
AACTGGATTCCCTCTTTCGTGGGAATGACGTAGTGCAGGTTTCCGTACGGATGGATTCGT 
CATTCCCGCGCAGGCGGGAATCTAGACATTCAATGCTAAGGCAATTTATCGGAAATGACT 
GAAACTCAAAAAACTGGATTCCCGCTTTCGTGGGAATGACGCGATTAGAGTTTCAAAATT 
TATTCTAAATAGCTGAAACTCAACGCACTGGATTCCCGCCTGAGCGGGAATGACGAAGTG
GAAGTTACCCGAAACTTAAAACAAGCGAAACCGAACGAACTGGATTCCCACTTTCGTGGG 
AATGACGGAATGTAGGTTCGTGGGAATGACGGGATGCAGGTTTCCGATGGATGGATTCGT 
CATTCCCGCGCAGGCGGGAATCTAGACATTCAACGCTAAGGCAATTTATCGGAAATGACT 
GAAACTCAAAAAACTGGATTCCCACTTTTGTGGGAATGACGCGATTAGAGTTTCAAAATT 
TATTCTAAATAGCTGAAACTCAACGCACTGGATTCCCGCCTGAGCGGGAATGACGAATTT
CAGGTTGCTGTTTTTGGTTTTCTGTTTTTGTGAAAATAATGGGATTTTAGCTTGTGGGTA 
TTTACCGGAAAAAACAGAAACCGCTCCGCCGTCATTCCCGCGCAGGCGGGAATCTAGTCC 
GTTCGGTTTCGGTTTTTTTGGCTAGTGCCGCAACATTAAATTTCTAGATTCCCACTTTCG 
TGGGAATGACGGGATGTATAGTGGATTAACAAAAACCAGTACGGCGTTGCCTCGCCTTAG 
CTCAAAGAGAACGATTGTCTAAGGTGCTGAAGCACCAAGTGAATCGGTTCCGTACTATTT
GTACTGTCTGCGGCTTCGTCGCCTTGTCCTGATTTTTGTTAATCCACTATAAATTTAATC 
CACTATATTTTTTGTTCCAAAGTCAAAATATGCCGTCCGAACATTCGGGCGGCAGACAAA 
ACGGCACTGCCCGATAAAGGCAGTGCCGTTGTCCGTTTCAAACCGTGAAACATCAGCCCA 
AATTAAAGGCTTTATGCAATACCCTGGTTGCCAGTTCCATGTATTTTTCATCAATCAATA 
CGGAAACTTTGATTTCGGAGGTGGAAATCATTTGGATGTTGATACCCTCTTCGGCGAGCG
TGCGGAAGATTTTGGCGGCTACACCGACGTGCGAACGCATACCCAAACCGACTGCGGAGA 
CTTTGCATACGGTGTCGTCGCCATCAATAGAAGCCGCGCCGATACTGTCTTGGCGTTCCG 
ACAGGATTTCCTiAAGTCTGCTTGTAATCGCCGCGCGGTACGGTAAAGGAAAAATCGGTTG 
TGCCTTCGCTGCCGACATTTTGGATAATCATATCGACTTCGATGTTGGCATCGGCAACCG 
CGCCTAAAATCTGATAGGCGACGCCAGGTTTGTCGGGTACGCCGCGCACGTTGATGCGGG
CTTGGTTTTTATCGAATGCGATACCGGTTACGGCAGCTCTTTCCATGTTGTCGTCCTCTT 
CAAAGGTAATTAAGGTGCCATTGCCGCCGTCTTGCAGGCTGCTCAGTACGCGCAGGCGCA 
CTTTGTATTTTCCGGCGAATTCTACTGAACGGATTTGCAAAACTTTCGAACCGAGGCTTG 
CCAGTTCGATCATTTCTTCAAATGTAACCGTATCCATGCGGCGCGCTTCGGGTACGACGC 
GGGGGTCGGTTGTGTAAACGCCGTCTACGTCGGTATAGATTTGGCACTCGTCGGCTTTGA
GCGCGGCGGCAAGCGCGACGGCGGAAGTGTCGGAACCGCCGCGTCCGAGCGTGGAAATAT 
CGCCTTCACTGCTGATGCCTTGGAAGCCGGCAACGATGACGACTTTGCCGGCGGTAAGGT 
CGGCACGCATTTTTTCGTCATCAATGCTTTCGATGCGGGCTTTGGTGTGGGCGGTATCGG 
TTTTGAGGGCGACCTGCCAGCCTGTGTAGCTTTTGGCATCCACGCCGATGTCTTTCAATG 
CCATCGCCAAAAGGCCGATGGTTACTTGTTCGCCGGTAGCTAAGACGACGTCCAGCTCGC
GCGGATCGGGATGCTCTTGCATTTCGTGCGCCAGTGCGACCAGTCGGTTGGTTTCGCCGC 
TCATGGCGGATACGACGACTACGATGTCGTGTCCTTCGGCGCGGGCTTTGGCGACACGTT 
TGGCTACGTTTTTGATGCGTTCGGGCGAGCCTACTGATGTGCCGCCGTATTTATGTACGA 
TTAACGCCATGTTTCGTGCTTTCTTGTGGGGGTTGTCGGGCAGCTTGGTTTGCTGGAAAA 
AGGGTTATTATTACTATTTTTTACATGGAATTCAAGAACGGACTGCGCTTTCCCGCCTGC
CGTTTGACAGCGGTCAGCGAAAAACCTGTTCTTTCAGATTGTTGACAAAATGCCGTCTGA 
ACGGTTTTCAGACGGCATCCGGACGACAATCAGGCGGCGGACAACGCATTTTGCTGGTGT 
TGCAGCAGTTCGCCTATGCCTTTTTGCGCCAGTGCAACCAGTTTGCCCAATTCGTCCAAA 
CTGAACGGCGCGTCTTCCGCCGTCCCCTGTATTTCGATGATTTTTCCCGATGCGGTCATG 
ACGATATTCACATCACTGTCGCAACCGGAGTCTTCGGGATAATCCAAATCCAAAAGCGGC
ACGCCGTTCACTACGCCTACTGACACAGCGGCAACGGCTTCGCGGATGGGGTTTTCACTC 
AAAATGCCGTCTGAAACCAGTTTGCCGACGGCGATTTGCAGCGCGACAAACGCACCGGTA 
ATCGAAGCCGTGCGCGTACCGCCGTCTGCCTGAATCACATCGCAGTCAATCAAGATTTGT 
CGTTCACCGAGTTTTTCCATATCCACGACCGCGCGCAGGGAACGCCCGATCAAACGTTGG 
ATTTCTTGTGTGCGCCCGGACTGTTTGCCCGCCGAAGCTTCGCGGAGCATCCGGGAAGCA
GTTGAGGCAGGCAGCATCCCGTATTCCGCCGTTACCCAGCCTTGGTTTTTACCGCGCAGA 
AACGGCGGGACGTTTTCATCTATGGAAGCGGTACAAATCACTTTGGTATTGCCGCATTCA 
ATAAGGCACGAACCGTCCGTATGCGGCAGGAAATGAGGGGTGATTTTGATATCGCGCAGG 
CTGTCGGCGGCGCGCGAGATGCGGATGTAATCAGGCATACTGCCCTCCCGTTAAAAACAG 
ATAAATTAAAAAGCCTTAAATATGAAAAATCACATTTAAGGCCTTCAAACTGAAAATTTC
TACGCCTCTTCGGCTTTGCTGCGGATAATCAAAAGCGGCAGGTGGCTTTGGCGCATTACC 
GTTTCGGCAAAACTGCCCATTAAAAGGTGCATCAGCCCGGTACGTCCGTGCGTACCCAAC 
ACCAGCAGGTCGGCACCGTTTTCATCGGCATAATCAACCAAATCCTGCGCCATTTCACGC
GCACCCTTATTGGCAACCAGCAGGTGTTTGACGGTATTTTCCACACCCAGTTCCTGGGCG 
GTGCGCTCGGCGGCATCCAAAACTTCGTTGCCTTGCGCGACGGCGGCGGCTTCGTAGCTT 
TCGTGTTGCAAAAATTCGGGGGCGAGTGCCATATATTCGGCAGGATToGCAACGTGCACC 
AAAGTCAGGCGCGCACCGTTGACCCCGGCAAGCTCGGCGGCATGTTTCAGGGCATTGATG
GACGTTTCACTGCCGTCAACGGCAACAACCAAATGTTTGTACATATCGTATTCTCCTTTT 
GCACCGCCTCGCGGTGCCCTCTTGTCGGATGGGCGCAGGGACAGTTTGCGCTGTTTCATT 
ATAGACCCGCCGTCGGGCTTTATACAACAGCCGAACAGCCCGACCGCTTTCCAGTATAAT 
ATGCCGCTTCCGTGCAGTCAGGCATTTTTTGCCGGCTTTCGTTCACTTTTTGATTTGACG 
CAATCTTGCAGGATTCGACCATGTCCGACAACGCTTTGACCTCTTCGCGACGCTTCGGCG
GCATCGCCAGACTCTACGGAGACTCTGCCTTGGCGCACTTTTCACAGGCACACGTCTGCG 
TAGTCGGCGTGGGCGGTGTCGGCTCGTGGGCGGTCGAGGCTTTGGCGCGGACGGGCATCG 
GACGTTTGACTTTGATTGATTTGGACAACGTTGCCGAATCGAATGTCAACCGCCAGCTGC 
ACGCCCTGACCGGCGACTTCGGCAAAGCAAAAGTTACCGCCTTGCGCGAACGCATTACAC 
AAATTAATCCGCAATGCGAAGTGTTTGAAATTGAAGATTTCGTTACCGAAGACAATTTGC
CGGAATACTTCGGAAAAGGTTTTGATTTCGTCATCGACGCGATCGACCAAGTGCGCGTCA 
AAGCAGCAATGGCGGCTTATTTTGTGGAACGCAAACAACCGTTTGTCCTCAGCGGCGGCG 
CGGGCGGACAAAAAAATCCGGCGTTAATCCAAACCGCCGATTTGAGCCGCGTAACCCACG 
ACCCGCTGCTTGCCAACCTGCGCTACACCTTGCGGAAACGCTACGGATTCAGCCGCGATA 
CGAAAGCAAATATGCGCGTGCCTTGCGTGTATTCGACCGAAAATATCGTGCCGCCGCAGT
CTAGGGAGGCTTGTTCGGCAGATGCCGCTCCGCAAGGCTTGTCGTGCGCCGGCTACGGTG 
CAAGCATGCTCGTTACCGCTTCGTTCGGGCTATATTGCGCACAGGCGGCGGTGGAACACA 
TCGCAGACAAAAAATAAGCAATGCCGTCTGAAACAGGATTCAGACGGCATTTGAACAAAC 
TATGGTTATGATTTAAGACAACAAAGGATACGGATAAAAAATAACATAAAATATATGATT 
CCTAATAATATACCAAGTATCGGAGAGCTATTTAATGGAATTCGTTAATAATTTAGTTAT
TTTTTCATTTTTATTACTAATGCTTATTCCGATATTTTTTGTAGTATATGGTATATACCA 
TAAGATACGTTATCGCAAAATATGTATCCTAAGAACAAGTTTTATATTATTAGTGGTAAT 
ACTTTGCAGTATGTATTACATATATTGCCGTTATCTTGACCAACAAAAAGTAGCTTATTA 
TTGCATAGATGAACAATGTATTTCTATTGTTCATCTATACAAAGATTATGGTATAAACTC 
TCCCACATATGCGAGAATTTACGCAGGAAAAATATTGTTTAGATTTCAAGTAAGAGCTAA
AAATTACGCTGAATTACTTATGGAAGATGATATATCAATTAGTAAAAAAATTTTGGGGAA 
TAAATTTATCATTTATGGGTCGCTACCTGTAATATACGGTAATGTAGATAATATTGAAGT 
AAAAGAAGCTACTGGTTATATAGATAGATCCAGTACTGATTATATTGTCTCAAGAAACTT 
AAAATTCAGACATTTATATTAATTAAGAGGTTTTAGCAAGAGTGCCGTC7WVATATAGGG 
CGCATCATCGAATTCGCGAAAGACAAACGCTACGATGAACGTTTCAJ^.GGATTTGAAAAAA
GAATCCATAGGCTATCTGAACCGGCATCCCGGTTTGGTGTCCGACTACCTGAAGGCGGCA 
ATCAAGCTGTCGGTTCAGAAAAACCAACATCAGCACGCCTAAAACCGTATTCACAACCTG 
CTCCTTTTCAAAACATTTGCATTTAAAAGCCGTTATAATGCCGTCTQAACATCTGCCCGA 
CCACATTATACGTGAATGTCGGCAGATTGTTTTCTTTTGTAAACTTATATTAAAATCCAC 
TTACCGATTCACGCCATGCCGCCCATCCCTGCCCCATCTGCACCATCCGAGCACACTGTC
GCATGGGTATTCGGCCAACCCGTTACCGATTTGCCCCAGGATTTGTTTATTCCGCCCGAT 
GCATTGAAAGTCGTATTGGGCAGCTTCCAAGGCCCTTTGGATCTACTGCTGTATCTGATC 
CGCAAACAGAATATCGACGTACTGGATATTCCGATGGTGAAGATTACCGAGCAGTATCTG 
CACTACATCGCCCAAATAGAAACCTATCAGTTTGATTTGGCGGCGGAATATCTTTTGATG 
GCAGCAATGCTGATTGAAATCAAATCGCGCCTGCTGCTGCCGCGTACCGAAACCGTCGAA
GACGAAGAAGCCGACCCGCGTGCCGAGTTGGTGCGCCGCCTGCTGGCTTACGAACAGATG 
AAGCTGGCGGCGCAGGGTTTGGACGCGCTGCCCCGAGCCGGACGGGATTTCGCGTGGGCT 
TACCTGCCGCTGGAAATTGCCGTCGAAGCCAAGCTGCCCGAAGTCTATATTACCGACTTG 
ACGCAAGCGTGGCTGGGTATTTTGTCTCGGGCAAAACACACGCGCAGCCACGAAGTAATC 
AAAGAAACCATCTCCGTGCGCGCGCAAATGACGGCAATCCTGCGCCGTTTGAACGGACAC
GGAATATGCAGGTTTCACGACCTGTTCAATCCCAAACAGGGCGCGGCTTACGTGGTCGTC 
AACTTCATCGCACTGTTGGAGCTTGCCAAAGAAGGATTGGTCAGAATCGTGCAGGAAGAC 
GGTTTCGGAGAAATCCGAATCAGCCTCAATCATGAGGGGGCGCATTCAGACGGCATTTCC 
GGCACACGAGGCGGGCGCGATGTGTTCTAATACGCCCCAAGCCGCCACCAAAAATCGGGA 
GACACGCCATATGACCGGCATCATACATTCGCTGCTTGACACCGACCTCTACAAATTCAC
TATGCTGCAAGTGGTTCTGCACCAGTTTCCGCAGACGCACAGCCTTTACGAATTCCGCTG 
CCGCAACGCCTCGACCGTCTATCCGCTTGCCGACATCAGGGAAGACTTGGAAGCCGAACT 
CGACGCGCTCTGCCAACTACGCTTCACCCACGACGAACTCGGCTATCTGCGCTCCCTGCG
TTTCATTTW^GCGACTTTGTCGATTATCTCGAACTCTTCCAGCTCCAACGCCGCTTTGT 
CGAAATCGGCACAGACGATAAAGACCGTCTGAACATCCGCATCGAAGGTCCGATGATACA 
GGCGATGTTTTTTGAAATCTTCATCCTCGCCATTGTCAACGAACTTTACTTCCGCCGCCT 
GGAAACCCCTGCAGTCATAGAAGAAGGCGAACGCCGGCTTCAAGCCAAAGCCGCGCGCCT
CAAAGAAATCGCCGCCGCACAAAACCCCGACGAACCGCCCTTCCTGATTTCCGACTTCGG 
CACGCGCCGCCGCTACAAGCTCGCGTGGCAGGAACACGTCATCCGCACCCTGCTTGAAGC 
CGCCCCCGGCATCGTACGCGGCACCAGCAATGTCTTTCTCGCCAAAAAACTCGGCATCAC 
CCCCATCGGCACCATGGCGCACGAGTTCCTGCAGGCATTCCAGGCCCTCGACGTACGCCT 
GCGGAATTTCCAAAAGGCCGCGCTCGAAAGCTGGGTGCACGAATACCGGGGCGATTTGGG
CGTTGCCCTGACCGACGTGGTCGGTATGGATGCCTTCCTGCGCGATTTCGACCTCTATTT 
CGCCAAACTTTTCGACGGGCTGCGCCACGACAGCGGCGACCCTTACGTTTGGGGCGACAA 
AGCCTACGCCCACTATCAAAAGCTCAAAATCGACAGCCGCACCAAAATGCTGACCTTCTC 
CGACGGGCTGGACATCGAACGCTCTTGGGCATTGCACCAATATTTCAAAGACCGCTTCAA 
AACCGGCTTCGGCATCGGCACCAACCTCACCAACGATATGGGGCATACGCCCTTGAATAT
CGTCTTGAAACTGGTCGAATGCAACGGGCAGTCCGTCGCCAAGCTGTCCGACTCTCCGGG 
CAAAACCATGACCAACAACAGCACCTTCCTCGCCTACCTGCGCCAAGTGTTCGACGTACC 
CGAACCCGAAACGCCGTAAACCGGCAGAAAAAGCGCACAATTCCTGTTTCTGCCGCATAA 
AATCTTTTAAAATACCGCCTGATTTGAATTTAACCGAAAGACCGAACTTCATGAACCTAC 
ATCAAACCGTCGAACACGAAGCCGCCGCCGCCTTTGCCGCCGCAGGCATCGCCGACAGCC
CTATTGTTTTGCAGCCGACCAAAAACGCCGAACACGGCGATTTCCAAATCAACGGCGTGA 
TGGGTGCGGCGAAAAAAGCCAAACAAAACCCGCGCGAGTTGGCGCAAAAGGTCGCCGAAG 
CATTGGCGGACAACGCCGTGATTGAAAGCGCGGAAGTCGCCGGTCCGGGCTTCATCAACC 
TGCGCCTGCGCCCCGAATTTCTCGCGCAAAACATTCAGACGGCCTTGAACGACGCTCGTT 
TCGGCGTGGCAAATiACCGACAAACCGCAAACCGTCGTTATCGACTATTCTTCGCCCAATC
TGGCGAAGGAAATGCACGTCGGCCACCTGCGTTCCAGCATCATCGGCGACAGCATTTCGC 
GCGTGTTGGCATTTATGGGCAATACCGTTATCCGTCAAAACCACGTCGGCGACTGGGGTA 
CGCAGTTCGGTATGTTGGTCGCTTATTTGGTCGAGCAGCAAAAAGACAATGCCGCGTTCG 
AGCTGGCGGATTTGGAGCAGTTTTACCGCGCCGCCAAAGTGCGCTTTGACGAAGACCCTG 
CCTTTGCCGACACCGCACGCGAATACGTTGTGAAGCTGCAAGGCGGCGATGAAACCGTTT
TGGCATTGTGGAAACAGTTTGTCGATATTTCGCTCTCGCACGCCCAAGCCGTTTACGACA 
CGCTGGGCTTGAAGCTGCGTCCTGAAGACGTGGCAGGCGAATCGAAATACAACGACGATT 
TGCAGCCCGTGGTCGATGATTTGGTTCAAAAAGGTCTGGCGGTTGAGGACGACGGCGCGA 
AAGTCGTGTTCTTGGACGAATTTAAAAACAAAGAAGGCGAACCCGCCGCATTTATCGTGC 
AAAAACAAGGCGGCGGCTTCCTCTACGCCTCCACCGATTTGGCGTGCCTGCGCTACCGCA
TAGGCCGTCTGAAAGCCGACCGCCTGCTGTACGTCGTCGACCACCGCCAAGCCCTGCACT 
TCGAACAACTTTTCACCACTTCCCGCAAAGCAGGCTATCTGCCGGAAAACGTCGGCGCGG 
CATTTATCGGCTTCGGCACCATGATGGGCAAAGACGGCAAGCCGTTCAAAACGCGCAGCG 
GCGACACCGTGAAACTGGTCGATCTGCTGACCGAAGCCGTCGAGCGCGCCACCGCTTTGG 
TGAAAGAAAAAAATCCCGAATTGGGTGCGGACGAAGCCGCTAAAATCGGTAAAACCGTCG
GCATCGGCGCAGTCAAATACGCCGACTTGAGCAAAAACCGCACCAGCGACTATGTGTTCG 
ACTGGGATGCCATGCTCTCGTTTGAAGGCAACACCGCCCCCTATCTGCAATACGCCTACA 
CCCGCGTGCAAAGCGTGTTCCGCAAAGCAGGCGAATGGGATGCAAATGCGCCAACCGTTT 
TGACCGAACCGCTGGAAAAACAGCTTGCCGCCGAGCTGCTGAAATTTGAAGACGTACTGC 
AAAGCGTGGCGGACACGGCGTATCCGCACTACCTCGCCGCCTACCTCTATCAAATTGCGA
CCCTGTTCAGCCGCTTCTACGAAGCCTGTCCGATACTCAAAGCCGAAGGCGCAAGCCGCA 
ACAGCCGCCTGCAACTGGCAAAACTCACCGGCGACACGCTGAAACAAGGCTTGGATTTGC 
TGGGCATCGATGTGTTGGACGTAATGTAAAACCGCACCGCCCGATTGCGGACAACAGCCT 
CGCCATCCTTATCCGAATCTGAAAAAAGCGGCGCGATACACCGTATCCGCCGCCCCTCCC 
AAAATGCGAAACAAACAAACGCCAAGCAAGCAAGCAAGCAAGCAAGCAAGCAAGCAAGCA
AGCAAGCAAAAAATTATAACCCCCTTCCTGCCGACGCACGCACTTTCCGCGCGGCGCATT 
CCCCTTTTCCCGCCCCTCAAATCCGCCTTTTCTTCAGGCAGGGTTTCAGCCCGCCTCTTT 
TCCCTGTTTTCCTTTCCCCGACACGCGTGCGCTCCCCCTGCCGCACTGTGCTGCACTTTC 
GCGCCCGGACGGCATCGTTCCGCCATCCGGTTCTCTGTTTTACATACCCCTGTTTCAGAA 
AGAAATGCAGATGTTTCAACACACAGGACGACACATAAAGCACCGCCCTATGTGTTGCCC
TGATTTGGAAGGGGTTACGCCTCCCAAATAAAGTCTGATCCTGCCGCCCCGAAGGACAGA 
TGTCCGAGTGGCGAAGTTTCAACCGAAAAGGAAATACGATGAATATTCACACCCTGCTCT 
CCAAACAATGGACGCTGCCGCCATTCCTGCCGAAACGGCTGCTGCTGTCCCTGCTGATAC
TGCTTGCCCCCAATGCGGTGTTT-3GGTTTTGGCACTGCTGACCGCCACCGCCCGCCCGA 
TTGTCAATTTGGACTATCTTCCCGGCGCGCTGCTGATCGCCCTGCCTTGGCGTTTCGTCA 
AAATTGCCGGCGTATTGGCGTTTTGGCTGGCGGTTTTGTTTGACGGGCTGATGATGGTGA 
TCCAACTCTTCCCTTTTATGGATCTCATCGGCGCCATCAACCTCGTCCCCTTCATCCTGA
CCGCCCCCGCCCCTTATCAGATAATGACCGGGCTGTTGCTGCTGTATATGCTGGCGATGC 
CGTTTGTGTTGCAGAAAGCCGCCC-CCAAAACCGACTTCCGGCACATTGCCGTCTGCGCCG 
CCGTTGTGGCGGCAGCCGGCTATrTCACCGGCCATTTGAGTTACTACGACCGGGGTCGGA 
TGGCCAATATCTTCGGCGCAAACAACTTCTACTACGCCAAAAGTCAGGCGATGCTCTACA 
CCGTCAGCCAGAATGCCGACTTTATTACCGCCGGCCTGGTCGATCCCGTCTTCCTCCCCT
TGGGCAATCAACAGCGTGCCGCCACGCATCTGAACGAGCCGAAATCTCAAAAAATCCTCT 
TTATCGTCGCCGAATCTTGGGGGCTGCCGGCCAATCCCGAACTTCAAAACGCCACTTTTG 
CCAAACTGCTGGCGCAAAAAGACCGTTTTTCGGT.TTGGGAAAGCGGCAGTTTTCCCTTCA 
TCGGCGCGACGGTCGAAGGCGAAJ^TGCGCGAACTGTGTGCCTACGGCGGTTTGCGCGGGT
TCGCACTGCGCCGCGCGCCCGACGAAAAATTTGCCCGCTGCCTCCCCAACCGTTTGAAAC
AAGAAGGTTACGCCACCTTTGCGA7GCACGGCGCGGGCAGTTCGCTTTACGACCGCTTCA 
GCTGGTATCCGAGGGCGGGCTTTCAAGAAATCAAAACCGCCGAAAACCTGATCGGTAAAA 
AAACCTGCGCCATTTTCGGCGGCGTGTGCGACAGCGAGCTGTTCGGCGAAGTGTCGGCAT 
TTTTCAAAAAACACGACAAGGGACTGTTTTACTGGATGACGCTGACCAGCCACGCCGACT
ATCCCGAATCCGACATTTTCAACCACAGGCTCAAATGCACCGAATATGGCCTGCCCGCCG
AAACCGACCTCTGCCGCAATTTCAGCCTGCACACCCAATTCTTCGACCAACTGGCGGATT 
TGATCCAACGCCCCGAAATGAAAGGCACGGAAGTCATCATCGTCGGCGACCATCCGCCGC 
CCGTCGGCAACCTCAATGAAACCTTCCGCTACCTCAAACAGGGGCACGTCGCCTGGCTGA 
ACTTCAAAATCAAATAACAACAATGCCGTCTGAACGCACCAACAGCCTTCAGACGGCATT
TTGCAGACAGACCGACCCTTCAAGCCCACTTTTTTCATCATCTCCGATAAATTGCTTTGT
ATAGTGGATTAACAAAAACCAGTACGGCGTTGCCTCGCCTTAGCTCAAAGAGAACGATTC 
TCTAAGGTACTGAAGCACCAAGTGAATCGGTTCCGTACTATTTGTACTGTCTGCGGCTTC 
GTCGCCTTGTCCTGATTTTTGTTAATCCGCCATAAAGACCGTCGGGCATCTGCAGCCGTC 
ATTCCCGCGCAGGCGGGAATCCAGAACGTGGAATCTAAAGAAACCGTTTTACCCGATAAG
TTTCCGCACCGACAGACCTAGATTCCCGCCTGCGCGGGAATGACGGGATTTTAGGTTTCT
GATTTTGGTTTTCTGTCCTTGTGGGAATGACGGGATGTAGGTTCATAGGAATGACGTGGT 
GCAGGTTTCCGTATGGATGGATTCGTCGTTCCCGCGAAAGCGGGAATCCGGAAACCCAAA 
GCCACGGGAATTTATCGGAAAAACCGAAACCGCTCCGCCGTCATTCCCGCGCAGGCGGGA 
ATCTAGGTCTGTCGGTGCGGAAACTTATCGGATAAAACGGTTTCTTCAGATTTTACGTTC
TGGATTCCCACTTTCGTGGGAATGACGGGATGTAGGTTCGTAGGAATGACGTGGTGCAGG
TTTCCGTATGGATGGGATTCCCTCTTGCGTGAGGCTGACAGATGCCGTCTGAAAGACTTT 
CAGACGGCATAGCTTTTTCTCTTTGAATTTATAGTGGATTAACAAAAATCAGGACAAGGC 
GGCGAGCCGCAGACAGTACAGATAGTACGGAACCGATTCACTCGGTGCTTCAGCACCTTA 
GAGAATCGTTCTCTTTGAGCTAAGGCGAGGCAACGCCGTACTGGTTTTTGTTAATCCACG
ATAAATTTGCCACAAAAAAGCTGCCTCAAATGAATACCCGGGCAGCTTTTTGTTGATATG
ACTCCAATCAGCGGTGTTGCGGATTGTAACGTTTTTCCAAACGCAGGAATATCCAGCCTA 
AGAAAGTCGTCATCAACAGATAAATCAGGGCGACGGTGTAAAGCGGTTCTTCATAAACCG 
AATACCGGCCCGTAATCGTATTCTGAACATACGCCAACTCCGCCACAGCAATGACCGACA 
GCAGCGAGCTGTCTTTCAAGAGCGTGATGAACTCGCTCGCCAAAGGCGGCAGCATGCGGC
GCAATGCCTGCGGCAGAATCACATAGCGCATCGCCTGCGGATAGGTCAGCCCCAAAGAAC
GCGCCGCCTCCATCTGTCCTTTGTCTATAGACTGGATGCCCGCGCGGAAAATCTCACAGA 
TATACGCCCCCGAGTTGGCGATCAGTGCCAAAGAACCGGCAATCAGCGGCCCGTATCCGC 
GACGCAGCGCGATTGCCGCCTCGCCGCTGACCAAAATGCCGTCTGAAGGATGGACGAAAA 
ACGGAAACCACACATACGCCCAAATCACAATCTGCACAAACAGCGGCGTACCCCGGAACA
GCGTAACATACAGCAGCGAAACTTTACGCAACGCCCACGCCAGCACGCGCATCGGCGCAC
CGGCTTTTTCCAAGTGAATCAGGCGCGCCAACGCCAACAACAGACCCAATACCGAACCGC 
CCGCCGTTGCCACGACCGTCAGCCCCAAGGTCGTCAGTGCGCCGTAAAGAAACATCCAGC 
GGTATTCGTAAATAATGTCAAAACGAAAATCCATAAACCGTCCGTATCAAAAACCGGCGG 
AACTGCCGCCGTTGCAAAATAATCCGCCATTTTACCGTAAAAACCGCCGCCTGAACTTTT
TTATCGCGGCAGACGGCGGTTGCGCGTCTCCGCAAAAATGCAGGGCGCGCGGTTTTCAGA
CGGCATTTGCCGTTCAAAGCCGTGCGGTGTCTTTACCAAATGCCCAACCATTCGCCCACG 
GCATCCATCCAATCCTTATTGCCCCCGCCGCTCCTGCCTGCTCGGCGGTACGCCCACGGC
GCTTGCGGATTTTTAGCTTTCCACAATCCTTTGCGTTCCCTTTCCGCCTGAATTTGAGCG 
TCGGCATAATCGGCAAAATCCGCCTTATCCTGCTGTTCTTTAGCATAACTTTTATAATGC 
CACGCCGCCCCGTCCTGCACCTGCATCAGGTTCAAATCGGTTTTGCC3ACAGAAACCTGC 
GCCACTTCGCGCTGGTAGCGGTCGGTATCGAACACGCGCACGCTGACTTTCCTGCCTTCC 
GCCGCCGCGCGCAGGTTGTCGCGCGAACGCGTGCCGTAAGCCTGTTTCATCTCCGGCGCG
TCGATATACGCCATCCGGATTTTGTGTTTCGCGCCGTCGCCGTCGATAACGTGAAGGGTG 
TCGCCGTCATAGACTTTGGACACCGTGCCTGTGTAGCGGTGGCCGGATTTCGCCGATGCT 
CGGCGGCGGGCGGGCGCGTCGGAACCCGCGTCCCCTGCCGCGCCGAGTACGTCGAGTACG 
GCAACCGCCGTCCGCACCGCCTCGCTGCCGTACCCCGTATAACCCAACGCACCCAAAAGC 
GACAGGGCGACGGGAAGCCATTTCATGATTTTTTTAATCTGCATATTTTTCAAATGCCGA
TGCCGTCTGAACATATCGGAATCGGATTTCAGACGGCATCTTAACGTCAGGATTACCCTT 
GGCAGGGATAGATGACTTTCGCACCCTCTTCCGTCCCCAAAATCAACACATCGGCGGCAT 
CGCGGGCGAATATGCCGTTTTCGAGCACGCCGGTGATTTTGTTGATTTCGTCTTCCATCG 
TCAGCGGCTGATCGATATTCAAGCCGTGGACATCGACGATTTGGTTGCCGTAAAACGTGG 
TGTAGCCGATACGCAGTTCGGGCTGTCCGCCCATAGCGAGCAGTTTGCGCGAAACAAGAG
AGCGCGCGCTTTCGACGACTTCCACAGGCAGAGGGAATTTGCCCAAACGTGAAACATATT 
TGCTTTCATCCGCAATGCAGATGAATTTTTCGGACGCGCTGGCGACGATTTTTTCGTTGA 
GGTGCGCGCCGCCACCGCCTTTAATCATTTGCAGGGCGTGGTTCAC7TCATCCGCACCGT 
CGATATAGACCGCCAACCCCGATACTTCGTTCAAAGAAACGACGGGAATATCGTACTGGG 
CAAGCAGTTCGCCGGATTTTTTGGAAGTAGATACCGCGCCTTTGATTTTTTTGCCGCTCT
TACCCAAGGCTTCGATGAAAAAGTTGATGGTCGAGCCGGTACCGATGCCGATATATTCAT 
TTTCGGGTACGAATTCGACTGCTTTTTCGGCGGCGATGCGCTTGAGTTCGTCTTGTGTCG 
TCATATTTTTGTCCTTTGGGAAACCGTATCAACAAACAGCCGCCATCTTAACATTTTTTT 
GCACGTCCTGCCCGCCGCGTTCAAATGCGTACCAGCAATACCGCCGCCTGCGCCTCTATG 
CCTTCCATCCGCCCGAGATAGCCGAGTTTTTCGTTGGTTTTGCCTTTGATGTTGACGCAC
GAAATGTCTATGCCCAAATCGGCGGCGATGTTGGCACGCATTTGCGGAATGTGCGGCGCG 
AGTTTGGGTTTCTGTGCAATCACGGTCGTATCGACATTGACCGCCTGCCAACCCTGCGCC 
TGAACGCTTTGATACGCCGCACGCAAAAGGACGCGGCTGTCCGCATCTTTGAACTCTGCG 
GCGGTGTCGGGGAAATGGCTGCCGATATCGCCCAAACCTGCCGCACCGAGCAGCGCGTCG 
GTAACGGCGTGCAGCAGCGCATCGGCATCGGAGTGTCCGAGCAGCCCTTTTTCAAATGGG
ATTTCAACTCCGCCAAGTATCAGCTTTCTGCCTTCGGTCAGTTGGTGGACATCGTAGCCC 
TGTCCGATACGGATGTTCGTCATCGTTTGTGTTCCTGATGTTTTGAATTGAAGTTCAGAC 
GGCATCGAGCAGCAGCCTGACGATGTATGCGTCCTGCGGCTGCGTCAGTTTCAAATTGCG 
CACGTCGCCCTGTATCAGTAGCGGACGCACACCCAATTTTTCCACGGCGGACGCTTCATC 
GGTAATGCCGTCCAAGTTTTCCGCAGCCAATGCGCGGTGCAGCAGCCCGGCGCGGAAAAG
CTGCGGCGTTTGCGCCTGCCAAAGGCTCGTCCGCTCGACGGTTGCACTAATGTTCCCACC 
GTCCGCGCACTTGAGCGTATCGGCAATGGGAATTGCCAAAATCCCGCCTTCGGCGGCGTT 
GCCCGCCTGTTCTATCAACCGCGTCAAAGCTTCAGACGGCAGGCAGCAACGCGCGGCATC 
GTGTACCAGAATATTGTCGGTTTCCGCCGCCAAACCGGTTTCCAACAGTTTTGCCACACC 
GTTGCGGACGGTTTCGGCGCGGGTCTGTCCGCCGTTTTTCCACACCCGAACCTGTGGAAA
TGCCGTCTGAACCTTATCGGCAAACGTGTCTTCGGGCGAGACGACAACGACGGTCAAATC 
GACGGCCTCATGCCGTTCAAAAATCCCAATCGTATGTTCTAAAACGGTTTTGCTTCCGAT 
TTCGACATATTGCTTGGGTTTGTCCGCACCGAAACGCGCCCCGATGCCGGCGGCGGGAAT 
CAGCGCGATATTTTTGCGCTTCATGCGTCCGTCCCGCCGTTTTCAGACGGCACGGCTTCC 
TTGCGCCAGATACAGGCTTCGCCCAAGCCGTCCAAATATTGCCCGTGCGCCGCCAACTCG
TTTTCGTCCGCCCTGATGACTTTCAGTTTGCCGCTGCGTTTGGTTTCGGTATGCACCACG 
GGTTTGGTTTCCATTTTTTCCTCTGCGGCCGCACCCATCAGGTCGAACTGCCGCCGCGTC 
ATAGCAAGATAGACTTCGCCCAAAAGTTCGCAGTCGATCAATGCGCC3TGCAGGACGCGC 
TTGCTGCGGTCGACGGAAAAACGGTTGCACAAGGCATCCAGGCTGGCTTTCTGCCCGGGG 
AACATTTCGCGCGCCATCGCCAGGGTATCGGTAACGGTACAGCCGAGTTCCTCAACGGTC
GGCAACCCCATCCGGCGGAACTCCATATTGAGGAAGCCCACGTCGAATTTGGCATTGTGG 
ATAATCAGTTCCGCACCGCGCAGGAAATCGGCAATCTGCCTGCCGACCTCTGCAAACGGC 
GGCGCGTTTTTCCCTTCCAAAACCTGTATCGTCAAGCCGTGGACGCGTGCCGCCTCTTCG 
GGCATATCGCGCTCGGGGTGGACATAGAGGTGCAGGTTTTTGTCGGTCATTTGGCGGTTG 
ACCATTTCCAAACCGGCAAACTCGACCAAGCGGTCGCCGCCGTCGGCATACAGACCGGTG
GTTTCGGTATCGAGGATGATTTGGCGTGTCGTCATATCGGTGTCTTTCTTCTATCTTCGT 
AAATTGCTTATTTTTTAAGCAATGTATTTTTCTGTTTTCATTTCAATGCACAAACCCACT 
TATTCACAGTGTGTTCACAACATTGGGCAGGCGGATTGTGTATTTTGGGGACAATTTTTT
CAGACGGCATTCAAGGTTTTTTCCTGATTGCCGCCGCGCCTAAAAACCGCCTTTCGCGCT 
TAATCAAAAATACCGACAACGGAATATTGCCCAAAGCGACAATCAGATACAACAAGGAAA 
TGCTGTCAAACAAAAACAGCAACACCGCGCTCAAAACGGCAGCGGAAACCATAAAAATAC 
CGTTAACGATATTGTTGGCGGCAACGGCGCGGGCGCGGAAAGTCTCGCTACTGGCGGTTT
GCAGCCAGGTATAGAGCGGAACGGAGAAAAATCCGCCGAAAAAGCCGATCAGCGTCATCA 
CCGCCATCACGGGATATGCCCATCCTTGCGATAAAAACCAAAAAATGCCGTTCAGCCCTT 
CAAAACGGTGTCCGTGCGTCAGCCACACCAAAACCAAGCCGCAAACCGTCAAACCCAACG 
CACCAACCGTTACCCAAGCCAACATCAGGCGTTCCCTGCTGAACTTGGCACACAGTACCG 
AACCGGCGGCAATACCGATGGAAAACAGAGCAAGCATCAGGTTGAAAACATTGTCGTTGC
CGCCCAGATGGATTTGGGTAAAGGTCGGCAGTTGCGTGGTATAAACCGCGCCGACAAACC 
AAAACCACGAAATACCGATAATGGCGGTAAAAACGGGCTTGTGCCGCACCGTTTCACGCA 
GCAGGGATTTTGTGCCACGGACAATATTCCACTCAATTTGTGTATCGGCAGCCTTGGCGG 
GTACGGACGGCATAAACAGGCTGCCGACCGTGCCTCCGACGGCGACCAGCAAAACCAGTA 
TCCCGACAATATAAGGCGGTACACCTGCCACCGCCGTTCCCAAAATCTGACCGAACAGGA
TGGCGACAAACGTACCCGATTCAATCAGGCTGTTGCCCATCATCAACTCTTTGTCGTCGA 
GATAATCGGGCAGGATGGCGTATTTCAGCGGCCCGAACAGCGTCGATTGCGCGCCCATGC 
AAAACAGACACGCCAAAAGCAGCGGGGCAGACCGGATATAAAACCCGTATGCCGCCACCG 
CCATAATGATCATTTCCAGCACCTTGACCCAACGCGCCAAAACGGCCTTGTCGAATTTGT 
TACCCAACTGCCCCGACAGCGAGGAAAACAGGAAATACGGCAAAATAJU^CAGCAACGCGC
CCAAGTTCAACATCTGTCCGGCAGGCAGGAAGCCGTTTTGCCCCAAACCGTAAAACCCAA 
TCATCACAAACAGCGCGGTTTTGAACACATTGTCGTTGAACGCGCCGAGAAACTGCGTAG 
CGAAAAGAGGTGCGAAACGGCGGCTTTTAACCAGTCCCAAACCGCCTTTTTTAGCGTACA 
TCGTTTTCCCTCTCTTTTTCAATCAGTTTACTTGTCGAATCATCATCCATCAGGATGCGG 
TGCGCCGGCCCTTCCAAGTCGTCAAACTGCCCC-TTTTTGCCCGACCACCAAAAAAACCAG
CCGATGACAAACGCCAAAATAATGCTGATGGGCACCAATATAAACATGCTTTCCATCACA 
TATTCCCTGTCAAATCGTTCAAAACAAAAGTCTGCCCCGACACGGTCAGATATTCGTTAC 
GCAAAGTTCCGACGGGAGCTTCGTCAAAAAACAGCTCGATACGGTCTTTGACCACGCGCC 
AATATTGGGGGATTTCCGTCTGACCGAACGGCGACAGGACATGATTTTCCATTCCGCCTT 
CAAGTTTGACGGCAAAACGCCCGCTTTGCGGCCGTGCTTCCGATTCGTCGTCGGCAAGCA
GGATGAAAAAGCCTATATGCCGTCCCGATTGGTCATGAATACTGAAATAATGCATAAATT 
TCCCACCCGCCTTTTTTCAGACGACACCAACTAAAAACAGGGCGAATGTACCAGTTTGGA 
CGGGAAGAATGCAAAGAAATTCTCCCTCCCCCAGCCGAAAACACCGGCAAACCGCATATC 
CCCCTTTTTTCCGTCAAAATGCCTGACTTCCGCCATTTTCACGCAAACGCCCGATTAAGC 
CAAGCAATTGCAAAGATTTTTTGCTAGAATAGCCTGCTTCTTTTATCAACCTTTTCAGAC
GGCCCCACTACTTTCCCGCCCAGGAAGGCAAAACGGATTCGGCACGAJiTCCGGTTAGTAT 
CCGTGTCCGATTCCAATGCCGTCTGAAACTTTCCGGAGTAAGAAAATGTCCCAAAAATTG 
ATCTTGGTTTTGAACTGCGGCAGCTCGTCCCTCAAAGGCGCGGTCCTGGATAACGGCAGC 
GGCGAAGTCCTGCTCAGCTGCCTTGCCGAAAAACTCAACCTGCCCGATGCCTACATCACA 
TTCAAAGTAAACGGCGAAAAACACAAAGTCGATCTGTCCGCACATCCCGACCACACCGGC
GCGGTCGAAGCCCTGATGGAAGAACTCAAAGCCCACGGCCTCGACAGCCGCATCGGCGCC 
ATCGGCCACCGCGTCGTCAGCGGCGGCGAACTGTACAGCGAATCCATCCTCGTTGACGAC 
GAAGTCATTGCCGGCATCGAAAAATGCATCCCGCTCGCCCCCCTGCACAACCCCGCCCAC 
CTCTTGGGCCTGCGTGCCGCGCAAAGCATTTTCAAAGGCCTGCCCAACGTCGTCGTATTC 
GATACCTCCTTCCACCAAACCATGCCCGAAGTCGCCTACAAATACGCCGTTCCGCAGGAG
TTGTATGAAAAATACGGCCTGCGCCGTTACGGCGCGCACGGTACCAGCTACCGCTTCGTC 
GCCGACGAAACCGCGCGCTTCCTCGGCAAAGACAAAAAAGACCTGCGTATGGTCATTGCC 
CACTTGGGCAACGGCGCGTCCATTACCGCCGTCGCCAACGGCGAATCGCGCGACACCAGT 
ATGGGCCTGACCCCGCTGGAAGGGCTGGTAATGGGTACGCGCAGCGGCGACATCGATCCT 
TCCGTATTCGGCTTCCTCGCCGAAAACGCCAATATGACCATCGCCCAAATCACTGAAATG
CTGAACAAAAAATCCGGTCTGCTCGGCATTTCCGGCCTGTCCAACGACTGCCGCACCATT 
GAAGAAGAAGCCGCCAAGGGGCATAAAGGCGCGAAATTGGCCTTGGATATGTTTATCTAC 
CGCCTTGCCAAATACATCGGCAGTATGGCGGTTGCCGCAGGCGGTTTGGACGCACTGGTC 
TTTACCGGCGGCATCGGCGAAAACTCCGACATCATCCGCGAACGCGTGATCGGCTACTTG 
GGCTTCCTCGGTCTGAACATCGACCAAGAAGCCAACCTGAAAGCCCGCTTCGGCAACGCC
GGCGTGATTACCACTGCCGACAGCAAAGCCGTTGCCGTGGTCATTCCGACCAACGAAGAG 
CTGATGATTGCCCACGACACTGCCCGTTTGAGCGGTCTGTAAGGTTTTATCCGCACACGA 
ACTGCCTCCGGAAATGGAGGCAGTTTTTTTATCCGGCTTTCCATGCTTAAACAGCACTGC
CTCTTTTCAGACATTGACGGTTGCAGCCGCTTACCTGAACCTTATAGTGGATTAAATTTA 
AATCAGTACGGCGTTGCCTCGCCTTGCCGTACTATCTGTACTGTCTGCGGCTTCGTCGCC 
TTGTCCTGATTTAAATTTAATCCACTATAATGATTAACTATTTTTTAATCATGTTATTAT 
TTTCCATAAAATACATGACATTAAGATGTTTTTCCACAAAAGATACACACACCGGCAAAC
ACCGGCTGTGTTTATCTTTTCTTATGCCTATTTTTTAATCATCGTATTTTTATCTTTTAA 
TTTCAATACGCAAACTAACTTATACACACGGTTTTCACATCTTTAGACTGCTTCCGTGTG 
TATAGTGGATATTGCCGTTTTCCTTTCTGACAAAAATGCCGTCTGAGAACTTCAGACGGC 
ATTTGAAACATCGGAATCAGCGGTTTTGTTCATACCACTCGATAAACTTGTCTGCTTTGA 
CAAAACCCAGCAGCGGCTCGCTGCGGCTGCCGTCGGAGCGGACGACAAACACGCCCGGCG
GCCCGAACAGACCGTATTCTTTCAACAACGCCTGATGTTCGGGCGTGTTGGCGGTTACGT 
CGATTTGGAAAAAGCGTTCCATATCGACTGCCTGATGCACTTCCGGCTGATTGAGCGTGT 
AAGCCGCCATTTCTTTGCAGGAAATGCACCAGTCGGCATAAAAATCCAAAACGACGGGTT 
TGTCGGGATGTTCTTTCAACGCCGTATCCATCGCTGCCTTCAGCGCGGCAGTATCGGCAA 
ACATTTTGCCGTGTTCCGAAGATTTGCCTGCTTCGGCTGGTGGATTGAGGGTCAGGAAAT
GGTGCAGCGCGGTCGTTTTGCCGTTTGCGCCCTGCCAGCCGAACCACGCGCCGCCTATCA 
GCAATATACCGCCCAATGCGAATGCCACAGCTTTCGGACGGCGTTTCTGCCTGCGTCCGT 
TGACCAGCAGCATAAAGGCAGGAACCAGCATCAGCAGCGTGTACAGCGCGACGACGAGAT 
AATAGGGCAAGTGCGGCGTGGCGAGGTAAACGGCGACGGCTAGCAGGATGAAGCCGAATG
CGTATTTGACGGCATTCATCCAATCGCCTGCCTTAGGCAGGATATGCCCGCCGAACGTGC
CGATGGCAATCAGCGGAACGCCGGTGCCCAACGCCAAAGTGTAAAGTGCCAAACCGCCTA 
AAACCGCATCGCCCGTCTGACCGATGTAGCCCAAAGCAAATGCCAGCGGCGGGGCGACGC 
ACGGCCCGACAATCAGCGCGGACAATATGCCCATAATAAAGACGGAAACGATTTTACCGC 
CTGAAAGCCTGCTGCTTTGATTCTGAAAATACGACTGCACGGCGTTGGGAAGCTGGATGT 

TGAACAGCCCGAACATAGACAGTGCCAAGACGACCATTAAAGCCGATGCCGCCAATACCA 
CCCAAGCCTGCTGCAACCATACGGTCAGCAGTGCGCCCGTCAGTCCGGCAACAATGCCGA 
CCAGCGTATAAGTCAGAGCCAAACCCTGAACATAAACGACGGACAGCACAAACGCCCGCG 
CCTTGCCCGCCTTTTTGTCGCCGACCACAATACTGGAAACAATCGGCAACAGGGGATACA 
TACAGGCGGTAAAACTCAGGCCCAAACCAGCGAGAAAAAACGCCAAAAGATTGGCGTTGA 

GCGTATCCCAAGACAGCTTGAAACGGCTGTCGCCGCCCTCATCCCCCTTCGGGGGCGGCA 
GCGCCCCGCTGCCGTTTTGAGAGGAAGGCTGCAAAAAGCGGTCTTTGGCGGATGCCGGTT 
CGTCGGTTTGCGGATGGTAAGTGCCGTTGCCGAAAATATCAAACTCGGTATCCACGGGCG 
GATAGCACACGCCGGCTTCGGCACAGCCCTGATAGGTCAAAACCAATTTATACGGTTCGC 
CGACAGCCTTTGCATAAGGAAAGGCAACCTGCGCCTCGTGATGGTAAACCGTCTGCCTGC 

CGAAAAACTCGTCTTCCTTCTCTTCGCCCTTGCTGAAAGAAGGCTGTCCCAACAAATCCG 
CCGGATCGGTCTTGCCGACGATTTTCGCCTGATACATATAGTATCCGTCGGCAATCCTGA 
AACGGACGTTCACACCGTCGTCGGCAACGGCAAGCTCCGGCACGAATGCCTTTTCCGGCG 
GCAGCAGATCGTTCGCATCCAGCGCGAAAGCTCGTCCGCACAACATCAAAAATACGGCGA 
ACAGGCAAATCAGTTTTTTCATAATCGAATCCGTTTCAGACAAATAATTTGTCTGCATTA 

TAAATGGTAAGGTTGACGGTGGGATTTAATTTATGTAAAACCCGCCATTATCCGAACCTA 
TTTCCATAAACATCTTATCGAACCCGCCATGTACGATGTCAATACCCACGATGTCCGCCG 
CTTTTTCGCCCGCGTGTGGCAGCAGCGGCTCZ\ATCCGCTGCAACTGAGCGCACTGGAACA 
GAAAGCCCTCCGCATTGTCGAAGCCCATCCCGAATACCACCGTTATCTCGAACGCATCGA 
AGACCATCTGGACACCGACTGGCTGCCCGAAAACGGCGAAAGCAACCCCTTCCTGCATAT 

GTCGCTGCATCTGTCCGTCCAAGAACAGGCGGGCATAGACCAGCCGCACGGCATACGCGC 
AATCCACGACACCCTGTGCGCCAAACGCGGCTGGCTGGAAGCCGAACACGAAATGATGGA 
GGCACTGGCGGAAACACTGTGGACGGCGCAACGCTACGGCACCGGTTTGGATGTCAATTT 
CTACATGACCCGACTGCGCAAACTCATCGGCTTGGGTGCAGAAGATCAAGCCAGATTGAA 
CCCGCATGAAATCGCCTGACCATACCAACCGCCTGCAAAATGCCGTCTGAAGCGGAACAA 

CCCCTTTCAGACGGCATTCATTTTCCCCCAATCATTTCCACAACGCCTTTTTCAGCATAA 
TCAACCAATCCTTCTTATCCAAAACGGGGCGTTGTGCAAACACATCGTATCGGCACGCGT 
CCAGTTTCTGCAAAATCAACTGCGCCCCCAACACAATCATACGGAGTTCCAAACCGATAC 
GCCCATTCAGTTCCCTTGCCAAAGGCGAACCCGCCTTCAGCATACGGAACGCACGCCGAC 
ACTCATACGCCATCAGCCGCTGAAACGCCGCATCCGCCCGTCCTGCCGCGATCTGTTCCT 

CAGAAACACCGAATTTCAACAAATCGTCCTGCGGAATATAAACCCTGCCTTTTTGCCAAT 
CCACAGCCACATCCTGCCAAAAATTCACCAGTTGCAAAGCCGTACAGATGCCGTCGCTTT 
GCGCCACGCACACCGCATCCGTTTTCCCGTACAAAGCCAGCATAATGCGTCCGACAGGGT 
TGGCGGAACGCCGACAATAATCGGCCAGCTCGCCGAAATTTCCATACCTTGTTTTAaCCA 
CATCCTGAGAAAATGCAGAAAGCAAATCATAAAACGGCTGCAAATCCAAACCGAACGGCA 
CAACCGCCTCGGCATCCAATCGTGCAATCAAAGGATGCGCCGACCGGCCGCCCGATGCCA 
ACACGTCCAACTCGCGCTGCAAACCCTCCAACCCCGCCAACCTGGCTTCAGACGGCATAC 
TGCCCTCGTCCGCCATATCGTCCGCCGTCCGTGCAAACGCGTACACCGCGTGAACCGGCT 
TCCTCAACCTGCGCGGCAAAATCAGCGAACCGACGGGAAAATTCTCATAATGCCCAACCG 
ACATACCTTCTCCATCCATCAAACAAAATGCCGTCTGAAACGGAACAAACCCTTTTCAGA 
CGGCATCAGATACCTCCAAGCTGCCGGCAATCAGTGGTGGTGATGACCGTGCGGGCCGTG 
GACATGACCGTGTGCGATTTCCTCATCGGATGCATCGCGCACGCTTTCAACTGTAGCCTT 
AAAGCGGATTTTCATGCCTGCCAAAGGATGGTTGCCGTCCACCACCGCCTTGCCGTCGGC 
AACATCGGTTACACGATAGACGACAACATCGCCGGTTTCAGGATCGTCGGCTTCAAACAT 
CATGCCGACTTCGACTTCAACAGGGAACACGCCCGCATCTTCGATACGGACCAACTCCGG 
ATCCTGCTCGCCGAACGCATCGTCGGGCGACAGCGCCACATCGACCGTATCGCCGGCATC 
CTTACCGTGCAACGCCTCTTCCACCAAAGGGAAAATGCCGTCGTAACCGCCGTGCAGATA 
CGCAATCGGTTCTTCGGTTTTGTCCAAAAGCTGATTGTTGGCATCATACATCTCATAATG 
CAGCGAAACCACGGAATTTTTCACGATAGCCATATTTGTCCTTTCAGGAACAGCAGATTA 
ATTACAGGCGCATTCTAACACAACCGCCGCGCCGGCCGATTACCGTT/iACCTGTTCATAA 
ACTGTACAGCACATATTTCAATGTAAATCTTTGTTATTTTATTGCGGTGTAACTTTTTTA 
CAACATTCTTAAAACCATTCCGACCTGTCTGCCGACTTTCCCAATCCGCCTTAATAAATC 
ATACAAGATACTGAAATTATATTAATCTCTATAATATTTATCCCTATCGAATTTTTAACA 
GCAAAACCGTTTTACAGGATTTATCAATCCGCCCGCCAGAAAACTTTTCATTCAAACCTT 
TTTCCCATCTGTACGACATTGCAATCCCTTATTCCATAGTGCATAATTACGCAAATTCAG 
CGATGAATTTCCAACCCGGTTTGTAGTATGGTCGATAAAGACCTATTTGTTTCAATAATT 
TAAATTGGTTCTAAAGGTTACTAAAATGAAAAAATCCCTGTTTGCCGCTGCTTTGTTGTC 
TTTGGTTCTGGCAGCCTGCGGCGGTGAAAAAGCCGCTGAAGCTCCCGCTGCTGAAGCACC 
TGCCGCCGAAGCTCCCGCTACTGAAGCACCTGCCGCCGAAGCTCCCGCTGCTGAAGCACC 
TGCCGCCGAAGCTCCTGCTGCTGAAGCTGCCGCTACCGAAGCACCTGCCGCTGAAGCTGC 
CGCTACCGAAGCACCTGCCGCTGAAGCTGCCGCTACCGAAGCACCTGCCGCTGAAGCTCC 
TGCTGCCGAAGCTGCAAAATAAGCATTTTCCGCTTGCAAAAAAGCAGGATACGTTCAGTA 
TCCTGCTTTTTTGATTTTTCAGACGGCATCAGATTCCCTTCCTCAATCTTCTCCCTACCC 
TTCCGACAAACATGCTTGACCTTCATACCGAATTTTCCCGACTCCTACCGGCAGATGAAA 
TTGCCGAACCTTCTCCGACGCTTTTAAAAGACCAGCGCAACCGCTTTACGTCTGCACCAG 
ACATCATTTTGCAGCCGCTCAGCGTTAAAAGCGTGCAAACCATTATGCGTTTCTGCCACC 
AACACCGTATTCCGGTTACGCCGCAAGGCGGCAATACTGGTTTGTGCGGCGCGGCAGTAT 
CGGAAAACGGCGTATTGCTGTU^CCTTTCCAAACTCAACCGCATCCGCAGCATCAATTTGT 
CAGACAACTGCATAACCGTCGAAGCAGGTTCCGTACTCCAAACCGTCCAACAGGCAGCCG 
AAGCCTCAAACAGGCTGTTCCCACTCAGTCTCGCCAGCGAAGGCTCGTGCCAAATCGGCG 
GCAACATCGCCTGCAATGCCGGAGGTTTGAACGTATTGCGTTACGGCACGATGCGCGACC 
TGGTTATCGGTTTGGAAGTCGTCCTCCCCAACGGCGAACTGGTTTCCCATCTCCATCCCC 
TGCATAAAAACACCACCGGCTACGACCTGCGCCATCTGTTTATCGGTAGCGAAGGTACAT 
TGGGCATTATCACTGCCGCCACGCTCAAGCTGTTTGCCAACCCCTTAGACAAAGCAACCG 
CATGGGTCGGCATACCCGACATCGAATCCGCCGTCCGCCTGCTGACCGAAACCCAAGCAC 
ACTTTGCCGAACGCCTATGCAGTTTTGAGCTGATCGGCCGTTTTGCCGCCGAATTGTCTT 
CCGAATTCAGCAAACTCCCCCTGCCGACACATTCAGAATGGCATATTTTACTTGAGTTGA 
CCGACTCATTACCCGACAGCAATCTTGATGATCGGCTTGTCGAATTTCTTTATAAAAAAG 
GCTTTACCGACAGCGTGTTGGCGCAAAGCGAACAAGAACGTATCCATATGTGGGCGTTGC 
GCGAAAACATCTCCGCATCGCAACGCAAACTGGGCACCAGCATCAAACACGATATTGCCG 
TTCCTATCGGGCGCGTTGCCGACTTTGTCCGCCGGTGCGCCAAAGATTTGGAACAGAATT 
TCAAAGGCATACAAATCGTCTGCTTCGGACATCTGGGCGACGGCAGCCTGCACTACAATA 
CTTTCCTGCCCGAAATCCTCAGCAATGAAGTCTATCGTTACGAAAACGACATCAACAGCA 
CAGTCTATCGCAACGTCCTTGCCTGCAACGGCACGATTGCCGCCGAACACGGCATAGGTA 
TCATCAAAAAACAGTGGCTGGACAAAGTACGCACGCCTGCCGAAATCGCCCTGATGAAAA 
GCATCAAACAACACCTTGATCCATATAACATTATGAATCCGGGCAAACTGCTTCCGTAAC 
CGGCATTTCTGATTTGCATACACAACAAAGAAAGGGACAATAGATCCGATTGTCGGTTTA 
GCGCGAGCTCGTGAGTGCGGTTAAAAATTGGTGGAAATTACACGAAAAATGACCGCACTT 
TTAAAATAAAAAAATCGGCAGTGAATTTCCCTGCCGATTTTATTTTGTTACAACTTAACT 
TAAAACGTCCACTGTAAATTCAACGCACCTTGTTTAGCTTGATGATGTTTGCCTGTTTGG 



CGGTTGAATGTGGCTTGTAAGGTTAAGTGAGATTTGATTTTCACTGCTACACCTAATTGG 
CTCTCAATTGCCGTCTTATTGTTTATCACTCGACGCTCTCCGTCCATTTCCACACCGAAA 
GGTTTGTTGTGGTAAAGCGCGTTCACAGCGGCGAAAGGTTCAATAGCGATATTTTTATAG 
AGTGAAAATTGAGCTTTAGCTTG/y^CGCCAACCCGAGTTTGTAATTGGCGGGAGCCAAGT 
AAATTCACGTGGGCATTTTCGCTATCGCTGAATTTTCCGTTTACCCCCAAATAAGTCAAT 
TGTGCCTGTGGTTGTAGGTAAACACGAAGGCTGTTGCCCTTTTTAGTGAAGTGTTCCGCC 
AATAACGCATTGTAACCTGCTTCAATTGAGGCAGTAATACCTTTTGAAGTAAAACGTTCT 
GTACCATCTTCAGTGTTGATACGGTGGCGGAAGCGTTGATATTGCATCCAGCTATCCGCA 
TACGCACCTGTCTGTTTGTCCTGAAGTTGGTGCCAAGTGGCGTAAACGCCTGCACCAAAG 
CCTTTCACATTTCCCGTTGTAAGATTGTCTGTATCTGGGTTGTGGAAAGTGCTACGTTGT 
TCTGCTTGTCCGCCCATTAAGCCAATAGAAAGTTGATTACTTTCGTTTTGCCATGTGAAT 
ACTTCGCCGCCGAGTTGCACACCTTTACGATAGCCTTCTACAGGTGCTGTTTTGCCTTGC 
ACCCATTGGTTGGAATGTCCGTCAATCACACGCAACCACAAGCCTTTGCGTGGTAAAGTG 
CGGTCGAAAATATCGCTGTTTTTGTTGTTCAAACGCAAGGCGAATAAGGTATTGGCGGCT 
TGAGCCTGTTGTGCATAAATCGCCATATCATCGCGTTCTTGCACTTTGGTAAAAAAGCCC 
TCTGGGCGTTGTTGTAAAGAAAGCGTATAAATTCCCTTTTGGTGTTTGCCAGAAAGACGG 
AATGCGTGTTTATCTGCTGTGCCATTTACTTTGATAATTTGATGCCCATCGAGGCTTTTT 
AAATCGTCTATTGGATTTTCGAAGATGATGTCGGAAGTGCCAGTAACATTTTTCTCAAAA 
ATTAATGCAGTATTTTTCGCTTCTTTAGGATCGTAAGCAAAACGAAAACGAGCTCCGCCA 

GCATAATCTTCTTTTACGAGTAAACTTTCACTTTTAGTATTAAAACGGATGTCTGCATTC 
GTTGTTTTTAATTTCCCAACATTAGAATCCCAACGGGGCTCCCAGAGAGAATTTTCTAAG 
CGGAATTCATCCAAACTAATCGTTTGCCCGATAACGTGCGAGTTGTCTGTAACCTCAATA 
TAGTGGAATGGATCTAAACCAGAATATAGATGTGCTGCAAAAGAAACATAATTTTCAATA 
TGATGAATTACTTGATTAGCCCATTCTGTATAATTCCCGACAGATAAAATTTCGCTGTTG 

ATATGACTATTTTTTATTTTTGGACCTAAGGAGAATATATGACTTTTTACTATAAGAGGA 
TGGGATCCAAATTTTTCAGCTTGGCAAGTACTATAATCACGTATCTTAGTGTTAGAATTA 
AAACATTCCTTAAAATATTTCCGTATTTGTTCTTCTGTGTCCCCATTTCTTTTTGCAACC 
CCTAAACCTCGGGCGAAGCCAACTAGGTAACCTTCGGTATATTCTTGATCATAAAAAGAA 
ATCTTTTTTGAGTTATTGATGTTTTCGAATTGGTATGTTCTAGGGTATAGTGCGGGAAAG 

GGTGGAACTTTTGGATTATCCTCGGTTATAAGATAAGTTTCTTTTTTCCAATATTCACTC 
GTTTTATCGCGGAGTTTTTTTAAGCGGGTAATTTCATCATTAGTGAGCTTGGTTTTGTCG 
TAAACGTAATCAACAGCCAAAAGCGGAGAGGTATAAAGAATAGAAAAAAATAGACTTACA 
ATAAATGATTTTTTA7y\CTTCTGCTTGCTTGCTTGCTTGCTTCGAGTTTCATAATAAATT 
TTCCTTTGTCAAGTAAAAATAAATGGGGCGTGGATTTTAGCATAAAACTGAACAAAAAAT 

GTCATTTATCTCACATTTTTCTCTATTTATTTCTTGTTTATTAAAAGTAAACGTTTGCTT 
TTTGCTATTTTGTCAAGCCAGTTTGAAAATGTGTATAATTGCCCTCGTTATTTACAAAAA 
TTTCAGGAAAAATGACCGCACTTTACCCTTGGCTAATGCCAATTTATCATCAAATTGCTC 
AAACCTTTGACGAAAGCTTGGGGCATCATGCCGTGCTGATTAAAGCGGATGCTGGTTTAG 
GTGTAGAACGTTTACACATCAGGCGGCAGCCTTGCCCATACCGTCTGAAGCACTGTTTCC 

ACAATCAGCGCGTATGCTTAATCAACCGCTGTTTCTCGCGTTTCCAATCCGCCTCTTTCA 
TACTCTGGCGTTTGTCGTGCTGTTTCTTACCTTTTGCCAAACCGATTTCCATCTTGATTT 
TTCCGCGTGAAAAATGCAAATCCAGCGGCACGATGGTGTAGCCGGCACGTTCGGTTTTGC 
CGATTAATTTGTTGATTTCCGACTGGTTCAACAAGAGCTTGCGCGGACGTACGGCATCTG 
GTTTAATGTGTGTCGAGGCTGTGGGCAAAGCCGTAATATGGCAGCCGACCAGATAAAACG 

CGTCTTTTTTCCAATAGATATAACTCTCTTTAAGCTGTACGCGCGCGGCGCGGATTGCTT 
TGACTTCCCAGCCTTCCAAGACCAAACCGGCTTCAATCCGGTCTTCAATGAAAAAATCGT 
GAAATGCTTTTTTATTGTTCGCAATAGCCATAAACATCCTATCAATATCCGCCGTCAGAC 
GGCATAAACCCGAAAACAGAACCCATCATACCGCCTCTTCAACCGCCTGCACAATCTTCT 
CGGGATACAGCCTGTTGAGGCAGTCGGTATGCCCCAGCGGACATTCCCGCTTAAAACACG 

GCGAACATTCCAAGTGCAGGCTGACGATTTTCGCCCTATCGCTCAAAGGCGGCGTATGCG 
TCGGGCTGGAAGAACCGTAAACCGCCACCACCTTCCTGCCCAAAGCTGCCGCCAAATGCA 
TCAATCCGCTGTCGTTACACACGACCGTGTCCGCCAACGACAGCAAATCCATTGCCTGCG 
ACAAATCGGTTTTGCCGCACAAATTGACACACATACCGTCTGAAAGGCGGTTGATTTCCT 
CGGCAATTTCATCATCTTTTTGCGT^CCGAACAGCCAAACCTGCCAACCCGCCGCCAGAT 

AATGTTTGCCCAACTCGGCAAAATGCCTTGTCGGCCAACGCTTTGCCGGCCCGAATTCCG 
CACCCGGACAAAAAGCCAGAACAGGCTTTCCAATATCCAAGCCAAAGGTTTCGACAGAAA 
TTTCCCGCCGCCGTTCATCAATGGAAAACTCGGGGAATCCCGAATGCCCGTCAAAATCTT 
CCTGACTCGGATGCGCGAGAGCCGTATATCGATCCACCATCAAAGGCAGACGTTCCTTAT 
CCAGCCTGCGTATATCGTTCAACAGAAAATAACGGCTTTCACCGACATAACCCGTCCTTT 
TACCGATACCTGTCGCCAGCGCGATGATTGCCGATTTCAAAGAACCGGGCAACACGATAA 
CCTGATCGTATCCGCGCCGCCCCAAATCCCTACCGACCCGCC/y^CGGCGTTTCAACTCCA 
ACGCACCATGTCCGAACGAATTCTCAAGAATTTCATTCACTTCCGGCATACGCTCGAACA 
CCGCCATCGACCACTTCGGTGCGAACACATCAATCGTGCAACCGGGGTGAAGTTCCTTCA 
AACGGCGGAACAAGGGCTGGGTCATCACGCAGTCGCCTATCCAACTGGGGGAAATAATCA 
GGATTTTGATGGACATAACAAGAAACCGAAATCAGACAGGCAGAATTTTACCGCGAAACC 
GTTGGAAAACCTATCTTGCCGCATTCCGAACGCCGGACGTGCAAATATGAAAAAGCCCGA 
ACATTCAAGTTCGGGCTTCAAAATTCTGGCTCCCCGACCTGGGCTCGAACCAGGGACCTG 
CGGATTAACAGTCCGTCGCTCTACCGACTGAGCTATCGGGGAATGGGGCGTATTATAGCG 
TC 



The following partial DNA sequence was identified in N. meningitidis <SEQ ID 3>: 
gnm_3 

GCGGGGGCtTCCATCGCAGTCATGCACAACATCTGCATCAAATGGTTTTGCACCATATCG 
CGCAACGCGCCGGTAATGTCGTAAAACTCACCGCGCTCTTCCACACCGAGCTGTTCGGCG 
ATGGTCAACTGCACGCTTTCGATATATTTATTGTTCCACAGCGGCTCGAACATTACATTG 

GCAAAACGCAGCGCAAGCAGGTTTTGCAGGCTTTCTTTGCCAAGGTAGTGGTCGATGCGG 
TAAATTTGCCCTTCTTTGAAATAACGCGCAACATCGGTATTGATTTGCTGGGAAGAAGCC 
AAATCCGTACCCAACGGTTTTTCCAAAACTACGCGCACATTGTCGGCATTCAAACCGATC 
GCAGCAAGGTTTTCACAGGCTTGCGCGAAGA^ziTTTGGGCGCGGTGGACAGATAGATGACG 
ACGTTGTCGGTTTCTTTGCGCGCTTTGACCAAATCGCCCAAAGCGGCAAAATCGTCCGGC 

TGCGTAACATCGACTTTGAGATATGCGAAACGTTCGACAAACGATGCCCAAGCCTCATCG 
GAAAAATTTTCTTTCACATGGATTTTGGAACTGGTTTCCACCTTCGCCAGAAAACCTTCG 
GTATCCAACTCGCTGCGGCTGACCCCCAAAATACGCCCTTCGGGATGAAGCAGACCGGCA 
ACATGCGCCTGGTACAGACAGGGCAACAGCTTGCGCATCGCCAAATCGCCGGTCGCACCG 
AACAACACCAAATCAAAATTTGTTTGTGTACTCATCGTATTATCTCGTCAGGAAAGAATT 

TTTCGATGCCGTCTGAAACCTGTTTCCCCCATCACGCTGCATCGCAATATCGGAAACAAA 
GGCAGGCGGCATAATGAGTAGTAATACTACACACCGCTACACTTTTTGTCTATTCCCATT 
TTTACAATTTATTTGACCTAGTCCAAAAATCGGGCAGGTTTCCCCTATTCCGTTACAACA 
ATCGAAAGATTCTGCGATTTAAATCAAATTTCTTTTCAATGCCTGATTTTTTTGTAACAA 
AATTACAAATTTTGTACTATAATAACACCCGCTTCCCACTTTCAGACGGCATACCTTTTA 

AAATATAGTGGATTAACAAAAATCAGGACAAGGCGACGAAGCCGCAGACAGTACAGATAA 
TACGGAACCGATTCACTTGGTGCTTCAGCACCTTAGAGAATCGTTCTCTTTGAGCTAAGG 
CGAGGCAACGCCGTACTGGTTTTTGTTAATCCACTATACTTACCGTCTGAATACCCGATA 
CAAAAATCAGAAACGCACAAACAAATCCCCAATACCCCCCCCGTTCCGACAGGAGACCGA 
CCGTGAACACTACTCCTATCCACTCCAAACTCGCCGAAATCACCGGGCGCATTATTGAAC 

GCAGCCGTCCGACGCGTGAAAAATATCTGGCGAAGATCCGCAGTGCCAAACAAATGGGAC 
GCTTAGAGCGCAACCAGCTCGGCTGCAGCAACTTGGCACACGGCTATGCTGCCATGCCTA 
AAAGTATCAAAATCGAAATGCTTCAGGAAACCGTCCCCAACTTAGGCATCATCACCGCCT 
ACAACGACATGGTTTCCGCACACCAGCCGTTTAAAGACTTCCCTGACCAAATCAAAGACG 
AAGCGCAGAAAAACGGCGCGACCGCCCAAGTCGCCGGCGGCACGCCCGCCATGTGCGACG 

GCATCACGCAAGGCTACGCCGGCATGGAATTGTCGCTGTTCTCCCGCGACGTGATTGCCA 
TGAGTACCGCCATCGGGCTGTCGCATCAAATGTTTGACGGCAGCCTGTTTATGGGCGTAT 
GCGACAAAATCGTTCCAGGTTTGATGATAGGCGCGCTTTCGTTCGGTCATATTCCGGGTA 
TCTTCGTCCCCGCAGGCCCGATGTCCAGCGGTATCGGCAACAAAGAAAAAGCCCGCACCC 
GCCAGCTTTTCGCCGAAGGCAAGGTCGGACGCAACGAACTTTTGAAAAGCGAAATGGGTT 

CTTACCACAGCCCGGGCACCTGCACTTTCTACGGCACGGCAA.z\CTCCAACCAAATGATGA 
TGGAAATGATGGGCGTGCACCTGCCTGCCGCCGCCTTCGTCCACCCTTACACCGACCTGC 
GCGAAGCGCTGACCCGCTACGCCGCCGGACACCTCGCGCGCGGCATCAAAAACGGCACGA 
TTAAACCTTTGGGCGAAATGTTGACCGAAAAATCCTTTATCAACGCCTTGATTGGCCTGA 
TGGCAACCGGCGGTTCGACCAACCACACCATGCACCTCGTCGCTATGGCGCGTGCGGCCG 
GCGTGATTTTGAACTGGGACGACTTCGACGAAATTTCCTCCATCATCCCGCTGCTCATCC 
GCGTTTATCCGAACGGCAAGGCCGACGTGA.!\TCACTTTACCGCAGCGGGCGGACTGCCTT 
TCGTTATCCGCGAATTGCTGAATGCAGGCCTGTTGCACGACGATGTCGATACCGTCGTCG 
GACACGGTATGCGCCACTACACCAAAGAGCCTTTCCTTATCGACGGCAAACTCGAATGGC 
GCGAAGCCCCCGAAACCAGCGGCAACGACGACATCCTGCGCAAAGCTGACAACCCGTTCT 
CCCCCGACGGCGGTCTGCGCCTGATGAAAGGCAACATCGGACGCGGCGTGATTAAAGTGT 
CCGCCGTGCGCGAAGGCTGCCGCATTATTGAAGCGCCTGCCATCGTGTTCAACGACCAAC 
GCGAAGTGTTGGCTGCGTTTGAACGCGGCGAGTTGGAACGCGATTTTGTGTGCGTCGTCC 
GCTACCAAGGCCCGCGTGCCAACGGTATGCCCGAATTGCACAAACTGACCCCGCCTTTGG 
GCATCCTGCAAGACCGCGGCTTCAAAGTGGCGCTGCTGACCGACGGCCGTATGTCCGGCG 
CGTCCGGCAAAGTTCCAGCCTCCATCCACATGACACCCGAAGCCCTGATGGGCGGCAACA 
TCGCCAAAATCCGTACCGGCGACCTGATCCGCTTCGACTCCGTTAGCGGCGAACTCAACG 
TCCTGATTAACGAAACCGAATGGAATGCCCGCGAAGTCGAAAGCATCGACTTGGGCGCGA 
ACCAACAAGGCTGCGGCCGCGAACTCTTCGCCAACTTCCGCAGCATGACCAGCAGCGCGG 
AAACCGGTGCCATGAGTTTCGGCGGCGAATTTGCCTGATGCGCGTTTCAGACGGCCTTTT 
CAGACCGAAGGCCGTCTGAAAAATTATTCAAGCGTTTTAAGATAGACGTAGGTTGGATTC 
TCGAATCCGACACAGCCGTCCAAGATGTCGGTTTCTTGAATCCGACCTACAACCTGTCCC 
ATCTTAATAAAATACCCCATTCCACCCGGAGAACCGAAATGTCCAAACTGACCCCCCGCG 
AAATTTTGACCGCCGGCGCAGTTGTGCCGGTAATGGCGATTGACGACTTAAGCACCGCCA 
TCGATTTGTCCCACGCCCTTGTCGAAGGCGGCATCCCTACCCTCGAAATCACCCTGCGCA 
CCCCTGTCGGCCTCGATGCCATCCGCCTGATTGCCAAAGAAGTGCCCAACGCCATCGTCG 
GCGCAGGTACGGTAACCAATCCCGAACAGCTCAAAGCCGTCGAAGACGCAGGCGCGGTTT 
TCGCCATCAGCCCGGGGCTGCATGAATCCCTCGCCAAAGCCGGCCACAACAGCGGCATCC 
CCCTGATTCCCGGTGTTGCCACCCCGGGCGAAATCCAACTGGCTTTGGAACACGGCATCG 
ACACCCTCAAACTCTTCCCCGCCGAAGTCGTCGGCGGCAAAGCCATGCTCAAAGCCCTGT 
ACGGCCCTTACGCCGATGTTCGCTTCTGCCCGACAGGCGGCATCAGCCTCGCCACCGCGC 
CCGAGTACTTGGCACTGCCCAACGTCCTGTGCGTCGGCGGCTCTTGGCTGACACCGAAAG 
AAGCCGTGAAAAACAAAGACTGGGACACCATCACCCGCCTCGCCAAAGAAGCGGCGGCGT 
TGAAACCCAAAGCCTGATTCGCATCGTAAAAATGCCGTCTGAAAAACCTTTCCCGTTTCA 
GACGGCATTTTGCCGATTGAGGGCACAGTCGGCATACACGGCAGCACTGATCAGACATAC 
CGCCCCTAAAATGCCCATCCGCCTTCCGCATAATAAAAATAACGTTCAGTTCATTCGACA 
GCAGCCGGACAGCCCATACTACGCGGCTGA7Uy\AATGCCGTCTG7\AACGCATTCAGACGG 
CATCCACTTAAAAAAAACAACTGATTCAACGCCGATTPATCCGCTTCCAAAACCACTTTC 
ATCACTTGGTTTTCGGCGGCGTGTTTGAACACGTCGTAGGCTTTTTCCAATTCACTGAAT 
TTGAAATGATGGGTCAGCATTTTGGTGTAATCGACGGAGCTGCTGGAAATCGCCTTCATC 
AGCATTTCGGTGGTATTGGCGTTTACCAGACCGGTAGTGATGGCAAGCTTTTTAATCCAG 
AGTTTTTCCAGTTTGAAATCAACGGATTGACCATGTACACCAACCACAGCGATATGGCCG 
CCGGGTTTCACAATGTCTTGGCACATATTCCATGTAGCAGGGATACCGACGGCTTCGATG 
GCGCAATCCACGCCGTCTTCGCCGACGATGGCAAAGACTTGTTTGGATACTTCGCCGGAA 
GCAGGGTTAATGGTATGGGTCGCACCCAATTCTTTCGCCAGTTTCAAACGGTTTTCGTCC 
ATATCGCAAACGATGATGGCGGCGGGACTGTACAGTTGGGCGGTCAACAGGGCGGACATA 
CCGACAGGGCCTGCCCCAGCGATGAATACGGTGTCGCCGGGTTTGACATCGCCGTATTGC 
ACGCCGATTTCGTGGGCGGTCGGCAAAGCGTCGCTCAACAACAGGGCGATTTCTTCGTTG 
ACATTATCGGGCAGCGGAACGAGGCTGTTGTCGGCATAAGGCGTACGGACGTATTCGGCC 
TGAGTACCGTCAATCATGTAACCCAAAATCCAACCGCCGTTACGGCAGTGTGAATAGAGT 
TGGGTTTTGCAGTTGTCGCAAGTGCAACATTTGCTGACGCATGAAATAATGACTTTATCG 
CCGACTTTGATGTTTTTTACAGCCTCGCCGACTTCTTCTACAATACCGATGCCCTCATGA 
CCGAGAATACGGCCGTCGGCAACTTCGGGGTTTTTGCCTTTCCAAATACCCAAATCGGTA 
CCGCAAATCGTGGTTTTGACGATTTTCACCACCGCATCGGTCGGATCGATAATCTGCGGA 
CGGGGTTTTTCTTC/y^CGGATGTCGTTTGCGCCGTGATAAACCATTGCTTTCATGCTG 
ATACTCCTTGCTTGTTGATAAATAATTTCAATACCGCAATAAAGTTTCTTTATATGAGTT 
ATATGCCCCTACAAAAAATAAGTCAATAAGAATTATTTTCACAATGTTATACAATAACAT 
ACCGTTTTAAATATAAATAAAACCACCGATTGATATTAATGAACACACCCATCCCCTTCT 
CCGAACGGCTCATCCGCTGGCAAAAACAACACGGTCGCCACCACCTCCCTTGGCAGGTCA 
AAAACCCTTATTGCGTCTGGCTTTCCGAAATCATGCTCCAGCAAACGCAAGTCGCCACCG 
TGTTGGACTACTATCCGCGCTTCTTAGAAAAATTCCCGACCGTTCAGACGCTTGCCGCCG 
CGCCGCAAGACGAAGTGTTGTCC-rTGTGGGCGGGCTTGGGCTATTACAGCCGCGCGCGCA 
ACCTGCACAAAGCCGCGCAACAA3TCGTCAGGCAATTCGGCGGCACGTTTCCGTCGGAGC 
GCAAAGACTTGGAAACCCTCTGC3GCGTAGGCAGAAGCACCGCCGCCGCCATTTGCGCCT 
TCTCCTTCAACCGCCGCGAAACCATTTTGGACGGCAACGTCAAACGCGTACTCTGCCGCG 
TGTTCGCCCGCGACGGCAATCCGCAGGACAAAAAATTTGAAAACTCGCTCTGGACACTTG 
CCGAAAGCCTGCTGCCGTCTGAAAACGCCGATATGCCTGCCTATACACAAGGTTTGATGG 
ATTTGGGCGCGACCGTGTGCAAAC3GACGAAACCCTTGTGCCACCAATGCCCGATGGCGG 
ACATCTGCGAAGCGAAAAAGCAAAACCGCACCGCCGAGCTGCCGCGCAAAAAAACCGCCG 
CCGAAGTACCGACCCTGCCGCTTrACTGGCTGATTGTCCGCAACCGGGACGGCGCGATTT 

TGCTGGAAAAACGCCCCGCCAAAGGCATTTGGGGCGGGCTGTATTGCGTGCCGTGTTTTG 
AAAGTTTGAACGGGCTTTCCGAC7TTGCCGCCAAATTCTCCCTGACCATGGCAGATATGG 
ACGAACAAACCGCCCTGACCCACCGCCTGACGCACCGGCTGCTATTGATTACGCCCTTTG 
AAGCACAAATGCCGTCTGAAAGCCCTTCAGACGGCATTTGGATAAAGCCGGCGCATTTGA 
AAGATTACGGTTTGCCCAAGCCTTTGGAAATTTATTTAAACGGTAATAGGTTAGAATAAA 

CAAAATAAACCCATTGAACTGTT3TTTGCAGGTATCGCAGCAAGAACAACCGATGAATTT 
GGGTCGTATTTTAGGCGGCGGGATAATGTTCAAATGGGACATTTGGAACGGAAGAAGTCG 
GCAATTT^AAAAGGATTTAAAAAGC7VAAGAAGGTCAAAAACATGAACACAAACTTAAATG 
ACAAAGACAAAGCCATGGATACCGCAATCAGGTTTCAGAAAAGGATGAGGATTCCGAAAT 
TTTTCTTTTTAATTCTCGGAATCACAATGGTTTTGGCATTTATCCAAGACGTGATAACGG 

GTTCTAATTTTCTGCAAATAACAATTAATGTAAAATTTTCGTAAAAATTTATCGGCTTTT 
AAAACAAAATTGACTAAAATAGTCGCGAGTTTTTACTGCAATAAAGGAGATTGCAATGAA 
TATGAAAACCTTATTAGCACTAGCGGTTAGTGCAGTATGTTCAGTTGGTGTTGCGCAAGC 
ACACGAGCATAATACGATACCTAAAGGTGCTTCTATTGAAGTGAAAGTGCAACAACTTGA 
TCCAGTAAACGGTAACAAAGATGTGGGTACAGTGACTATTACTGAATCTAACTATGGTCT 

TGTGTTTACCCCTGATTTACAAGGATTAAGCGAAGGCTTACATGGTTTCCACATCCATGA 
AAACCCAAGCTGTGAGCCAAAAGPAAAAGAAGGTAAATTGACAGCTGGTTTAGGCGCAGG 
CGGTCACTGGGATCCTAAAGGTGCAAAACAACATGGTTACCCATGGCAAGATGATGCACA 
CTTAGGTGATTTACCTGCATTAACTGTATTGCATGATGGCACAGCAACAAATCCTGTTTT 
AGCACCACGTCTTAAACATTTAGATGATGTTCGCGGTCACTCTATTATGATCCACACGGG 

TGGTGATAATCACTCCGATCATCCAGCTCCACTTGGCGGTGGCGGCCCACGTATGGCATG 
TGGCGTGATTAAATAATTCGATTGTTCGAAACGAAAAGTGCGGTGAATTTTGACCGCACT 
TTTTTGCTAGATATTTAGCATTGAGACCTTTGCAATAACATAGGTTACTAAAATTTTATG 
CTCAATCTCATTTTCAAAATGCPAAACTTTTCTGATTTTTCCTACTTTTTGCTCAATATT 
AGGAAGGTTTTAGGCAATTGAAA^TTTTTTGGCGCATTTTTATGCGTCAAATTTCGTTAA 

CAGACTATTTTTGCAAAGGTTTCAATTCATAAGTTTCCCGAAATTCCAACATAACCGAAA 
CCTGACAATAACCGTAGCAACTGAACCGTCATTCCCGCGAAAGCGGGAATCTAGACCTTA 
GAACAACAGCAATATTCAAAGATTATCTGAAAGTCCGAGATTCTAGATTCCCGCTTTCGC 
GGGAATGACGAAAAGAGACCTTTGCAAAATTCCTTTTCCCCGACAGCCGAAACCCCAACA 
CAGGTTTTCGGCTGTTTTCGCCCCAAATACCGCCTAATTCTACCCAAATATCCCCTTAAT 

CCTCCCCGGATACCCGATAATCA3GCATCCGTGCTGCCTTTTAGGCGGCAGCGGGCGCAC 
TTAGCCTGTTGGCGGCTTTCAACAGGTTCAAACACATCGCGTTCAGGTGGCTTTGCGCAC 
TCACTTTAACCAGTCCGAAATAGGCTGCCCGGGCGTAGCGGAATTTACGGTGCAGCGTAC 
CGAAGCTCTGTTCAACCACATAACGGGTCTTCGACAAATATCGGTTGCGTTTGGTTTGCA 
CTTCCGTCAGCGGACGGTTGCGGTGGGCTTTGCGCATAATGCCGTCCAGCAACTGATGTT 

CTTCCAGATGTTGCCGGTTTTCCGCACTGTCGTAGCCTTTGTCGGCATAGACGGTCGTAC 
CTTTGGGCAGTCCTTCCAACAACGGCGACAGGTGTTTGCACTCATGGGCATTGGCGGGGG 
TAATGTGCAGTTTCTCGATATAGCCTTCTGCATCGGTACGGGTATGTTGTTTGTAACCGA 
GTTTGTAGAGGCCGTTTTTCTTTATCCAACGGGCATCGCTGTCCTTACTCGGTGTGGTTT 
GACCGCTGATTTGTCCTTCTTCGTCAACTTCTATGGCCTGACGCTGTTTGCTGCCGGCGG 

TCTGAATAATGGTGGCGTCAACGACGGCAGCGGATGCTTTCTCTATTTTTAAACCTTTTT 
CGGTCAGTTGGCGGTTAATCAGTTCCAACAGTTCAGACAGGGTATTGTCTTGCGCCAGCC 
GGTTGCGGTAGCGGCATAAGGTGCTGTAATCGGGGATGCTCAGTTCGTCAAAACGGCAAA 
ACAGGTTGAAATCGATGCGGGTAATGAGGCTGTGTTCGAGTTCGGGATCGGAGAGGCTGT 
GCCATTGTCCGAGCAGGACGGCTTTGAACATGGACAGCAGGGGATAGGCAGGACGGCCGC 

GGTGGTCTCTAAGGTAACGGGTTTTTTGACGGTTCAGGTATTGTTCGATCAGCTGCCAAT 
CAATCACCCGGTCCAACTTCAATAGCGGGAAGCGGTCGATGTGTTTGGCAATCATGGCTT 
GGGCGGTTTGCTGGAAGAAGGTGCTCTTGAGAAATCCCCTAAATGTCTTGGTGGGAATTT 
AGGGGATTTTGGGGAATTTTGCAAAGGTCTCTAGATGAGTGAAAAAGAAGTGCAGGCTGC 
CTAAAAAGACAGAAAAAGTCTTTCCGGCAGCCTGCACTTTGGTTTCATTTCAGTCAGTAA 
ACCCAGTAAACGACGGTCTGAAAACGCAGAACGTTACGAAAAAAGCAGCCTACACGCCCA 
TCCCCCGCCTTCTACCCGTTCTGTAAATCATACAGATAGCGGTAATA7CCGTTCGGCTTC 
GCCAGCAATTCCTGCTGTGTTCCCGCTTCCACAATCCTGCCTTTATCCATGGCAATGATC 
CGGTGTGCCGTTTTAACAGTGGACAGACGGTGGGCGATAATCAGCACCGTCCGGTTGGCG 
CAAATGGCCTGCATGTTCTGCATAATCGCTCGTTCACTTTCATAATCCAGCGCGCTGGTG 
GCTTCATCAAAAATCAGAATGCGCGGATTGGTGATTAACGCGCGGGCAATCGCAATACGC 
TGCCGCTGTCCGCCCGACAAGCCGGCCCCTTGTTCGCCCACCACGGTGCCGTAGCCTTCC 
GGCAGCTCCATAATAAACTCGTGTGCGCCCGCCAGTTTGGCTGCTTCGATAATGCGTTCC 
AGCGGCATACCCGTATCCGTCAGCGCGATATTGTCGCGTATGCTGCGGTTGAGCAGCACA 
TTCTCCTGCAAGACCACGCCGACCTGCCGCCGCAGCCAGGCAGGAGCGGCCT^GCCAAA 
TCGTTGCCGTCCACCAACACCCGTCCCTGCTCCGGTACATACAGACGCTGCACCAATTTG 
GTGAGTGTGGATTTGCCCGACCCCGAACGTCCCACAATCCCCAGCACTTCCCCCGCCCGA 
ATCCGCAGGTTCAAATCCTGCAAAATCAGCCTGCCGTCCGCCTTATAGCGGAAATCGACA 
TGTTCGAACGTAATCTCCCCCCGGATATCGGGCAAAGCCAAATGCGAAGACGCATTCTCG 
GTCGGCGCATTCAGAATATCCCCCAAACGCGCCRCCGAAATCCCCACCTGCTGGAAATCC 
TGCCACAACTGCGCCAAACGGATAACAGGCGCCGCCACCTGTCCCGAGAGCATATTAAAC 
GCAATCAGCTGCCCCACCGTCAGCTTGCTCTCAATTACCAGCCGTGCGCCAATCCACAAC 
GTCGCCACCGTCACCAGCTTCTGAATCAGCTGCACCCCCTGCTGGCCGACCACCGCCAAC 
TTCGTTACCCGAAATCCCGAAGCCACATAAGCCGCCAACTGATTGTCCCAACGCTGCGTC 
ATCTGCGGCTCCACCGCCATCGCCTTTACCGTACCCACCGCAGTGATGCTTTCTACTAAA 
AACGACTGGTTGTCTGCATTGCGCGCGAACTTATCGTTCAGACGCGTCCGCAGTATCGGA 
CTGATAAATGCCGACCAAAACGCATAGGCAGGCAACGAAGCCAATACCACCCAAGTCAGA 
GTGGAGCTGTAATACCACATCACCGCCAGAAAGATAAACGAAAACGCCAAATCCAACACC 
GAAGTCAGCGCCTGACCGGTCAAGAAATTGCGAATCTGCTCCAATTCCCGCACCCGAGCC 
ACCGTATCACCCACTCGTCTGTGCTCGAAATAGGATAAAGGCAGGGAAAGCAGATGCCGG 
AACAAACGCGCGCCCAATTCCACATCAATACGTGAAGTCGTATGTGCAAACAGATACGTC 
CGCAAACCGCCCAACACAATCTCAAACAGCGACACCACCAACAAAGCCACCGACACCACA 
TCCAAAGTAGAGAATCCCCGATGTACCAGCACCTTGTCCATCACCACTTGGAAAAACAGA 
GGCGTAATCAGCGCAAACAGCTGC7\ACACCACCGACACCACCAATACTTCAAAAAACAAC 
CGGCGGTATTTGATTACCGCCGGAATAAACCAGGTAAAGTCAAACTTTGCCAAACTGCCC 
AATACCGAAGCGCGGGAAGC7U\CCAATATCAGTTTGCCCGA^\TATCTGTTAGAAAATTCG 
GCAAAAGACAATACCGCAGACTTATTCGTAACCAAATCCTGTATCAAAAATTGGGCATGC 
TCACCCTCACCGTCTGTTTTGGCCAAAATGAAATGGTTGCCGTCATCACACCATACCAAT 
GCGGGTAAAGTCGCCATAGCCAAACGTTTAATAGGCTGGCGGACTACCTTTGCCTTCAAT 
CCCAAAGATTTGGCGGCTAACAGCCATTGCGTTTCATTTAAATCGCTCTGTGCGGAAGTA 
CAAAATTCATGCTGTATATCGGCAGGATTGGCGGCAATGCCGTGGTAATGGGCGAGGATG 
ATGAGGGCGGAAAGGGCGGGGAGCGGTGCGGATACGATAGACATAAATAAAATATAGTTA 
GATTGGATGTGGATAACGGCTGGCTGGAAAAGGAATATATTAAGTAGAAAGAAATATATA 
AATAAAACAGCAGAACGCATTGTAAGGATATATATGGGAATTGTAAAGAGAAAGTATGGA 
AAAGTTCTCGTTTCAGGAAGGTAAAACGGCTTAGGAATCGAGTTAGATGAGGATGCCTCG 
CACCTCTCGTGCCTCCTGCATACCGTTAAGGCACAGGGTTAAGGTGCAGGCTGCTCCGAA 
CTCTGTTGCGGTCGGGTAATGTTATTTTTTGTGTTTCAGGCAGCCTGAAATATCTGTATA 
TTTTTGTTTTAAATAGATTTTAAAGATTGATAACTGTTCTTGACGATTTTTCAAGAAAGG 
AGTAAATTTCAAGAAAGGAGTAAAGTGACTTATTATCAATGACAAGCAACGCGCGAAGTG 
ACAAGGAAAACTATCTACTTAAATTCTAAGGAGGCTTCGAATATCATAAACCAATCAGAA 
ACATAGAGATAAAAATTATGTACAAATATAATCCTCTTATACAATTTATTGCACAGTTGA 
TTATGTCTTATGGAGCAAGCGTAGGGTGGGCACTTGCTGCCCCACGCGTTTCATATTTCA 
AGGCAGCCTGAAACCGTGTGGGCATAAATGCCTACCCTACATCCCAAAAAACAAGCGCAG 
CCTGCGTGTGTAGGGTGCGAACTTTCGGCAGGTAGACACGCAGTTTTATATTTTCAAGCT 
GAGGGATGCTTAAGAAAAGTACAAAACATTAAAAAATAAGGGGCTGTACTAGATTAGCCC 
TAAATCCACACCAATCCCGCAAGATTTTTAGCTGTCGGGACGGTGTGCCGAAGTTAAATC 
GAAATTCGCATTCTTTCAAGAACAGCGGGAAAGATTTGCGATCAATTCCGTTCTATTTGC 
GCAAGACGCGTTTTGCCTGATTCCAAAAGTTCTCAATGCCGTTTATGTGGTTCTGACGGT 
CAGCAAATTCCTTGGAATGGTTGATGCGGTAATGGATAAAACCGCTTACGTCCAACTTGT 
CGTAACTGCTCAGGCTGTCGGTATAAACAATGCTGTCCGGCATGATTTTCTGTTTGATAA 
CAGGCATTAAAGTATCGGACTTGGCATTATCTACGACAACGGTATAGACCCGTCCGTTAC 
GTTTCAGAATGCCGAAGACAACCACTTTTCCTGCCGCACCGCGACCACGTTTGCCTTTAC 
GCCGTCCGCCGAAATAGCTTTCGTCCAACTCGACAGAGCCCTCGAAAACCTCATTGGCAG 
CCAAGGCCAGATAATGGCTGATGACCATACGGATTTTGCGGTAGAACAGGACTGCCGAAT 
TGGGATGGATACCCAAAATATCGGCAGCAGAACGGGCGGTAACTTCGAGTACAAAAAAAC 
GGAGCAGCTCTTTCTGTACTTTTTTCTTTAATTTGCAGTGTGTTATCTTCATATTTCGGG 
GGTAACATATCTGCTAATCTAGTACAGCCTCAAACAAAAAAGAGAAATTTTAATTTCGCT 
AAATCGCATAAAGATTAATCAAGAGTATCATTAAATGATATGAGTGAGCATCTATAATGC 
CAAGAAGAGTTGTTAAGACATAACGATTATTGAAATAGATTGTAAAATAGATACTTAGAT 
AGTCTGAAAAACGGATTTGTGAAACTTTTTATTACGCGCCATCATTTGAAAATGAAACTT 
AAAAAACACTTATCATAATAAATATTTTCTTTACGTTGTTTGCTAATAAACTCAGTGCAA 
TATCAGCGCAATATTTTATGGAAATTTTATGGATAACAAAAAAGAATTTATTAATAATTT 
AACAAATAGGTATATGTGGATCTATCCATTGGTCTTAAATATTCTATTTCTACCTTTTTA 
CCAGTCCTACCAATCTTTTTTTATTGCGCTTGGTTGTTTGTTTGCACTGGTTAGAAAAAT 
GCAAAGCTTAGATTTTAAATTACAAAATCATATTGTATTGTTAAATATAAAAAGTGCTTG 
GGCAGATAAAAAAGTATTTTTGATTAGGATAGTAGTGTCATGGTTGGCAGTAATGGAAAT 
ATGGATGTGTTTTATTTCGGAATCATCAACGTGGGTATGCGGTGCTTTTTGTTTAAATAG 
TGAAATATTGGAAAAAATTTTTCGTGGCTTTGGTTATTCTGGTAGTTTATATTTTTTATT 
TATATTGATGATTGATCTCAATAAGTTAAGAGAGAGTATTTGAATTTCATTTTTTGTTTG 
ACTTAAACTCAAGGAGAGTAACAATGATTGGTAGTGGTGATACTAAACAATGCAAAAAAT 
TTTCTGCGTGTGATGGAAAATACCACGTCTACGATCCCCTCGCCCTAGACTTGGACGGCG 
ACGGCATAGAAACAGTCACCGCCAAAGGCTTTTCAGGCAGCCTGAAGACTGAGAGAGTGA 
ATACGATGAGTATACACTCTATGCCACTAAATTGATATTCACTAAATCATACCAGCTATA 
TTTTATTTAATGAGACATATGAAAAATAAAAATTATTTACTAGTATTTATAGTTTTACAT 
ATAGCCTTGATAGTAATTAATATAGTGTTTGGTTATTTTGTTTTTCTATTTGATTTTTTT 
GCGTTTTTGTTTTTTGCAAACGTCTTTCTTGCTGTAAATTTATTATTTTTAGAAAAAAAC 
ATAAAAAACAAATTATTGTTTTTATTGCCGATTTCTATTATTATATGGATGGTAATTCAT 
ATTAGTATGATAAATATAAAATTTTATAAATTTGAGCATCAAATAAAGGAACAAAATATA 
TCCTCGATTACTGGGGTGATAAAACCACATGATAGTTATAATTATGTTTATGACTCAAAT 
GGATATGCTAAATTAAAAGATAATCATAGATATGGTAGGGTAATTAGAGAAACACCTTAT 
ATTGATGTAGTTGCATCTGATGTTAAAAATAAATCCATAAGATTAAGCTTGGTTTGTGGT 
ATTCATTCATATGCTCCATGTGCCAATTTTATAAAATTTGCAAAAAAACCTGTTAAAATT 
TATTTTTATAATCAACCTCAAGGAGATTTTATAGATAATGTAATATTTGAAATTAATGAT 
GGAAACAAAAGTTTGTACTTGTTAGATAAGTATAAAACATTTTTTCTTATTGAAAACAGT 
GTTTGTATCGTATTAATTATTTTATATTTAAAATTTAATTTGCTTTTATATAGGACTTAC 
TTCAATGAGTTGGAATAGTTTTGGTAATTTTATGAGCGCACGCTCATCCGCGTTAGCAGA 
ATTTGGAAATATGGTTGCTAATTTAGTTTCTGCAAAAAATGAGAAAGATATCTCGAAACG 
TAATGAATATTACAAACAAGCTGGTTATAGTGCATTATTAGCATTTGGTAATTTGGCTAG 
TAATATTGCACCAGGTAGTACGTCATCGCATATTGTAAACGGAACAAATGCCTCTGTGAT 
TGCAAGCCGTCTCTCTGGAAATATATCTTCAGCTATTCAGGAGCATAAAGATGGTAAAGT 
TAATATCAACCGTTTTCAAAATATTTTAGCGGATTTATATTCATTGGGAGGGTTAGGAAG 
TACATTAATAGAGAAGAATGGAAATATGCAGAGTTGGGGGATTCCATTAGCAATTGCTGG 
AGATATAATTGCAGCAACGGCTATTGCCACAGGAGATACTGGTACGATATCTACAGAGGA 
ATTTTATAATTTTGACAACTGGAAAGGTTTTGGGTATGAGCTATTTGAAGACTGGTCTCG 
TTGGGTATACGACTGGCTGCCCGACGGCTGGAATCTGTGGAAAGAATTGGACAGAAACCG 
TTCAGGCCAATACCACATCTACGACCCCCTCGCCCTAGACCTAGACGGCGACGGCATAGA 
AACAGTCGCCGCCAAAGGCTTTTCAGGCAGCCTCTTCGACCATAACGGCAACGGCATCCG 
CACCGCCACTGGCTGGGTTTCTGCCGATGACGGTTTACTCGTCCGCGATTTGAACGGCAA 
CGGCATCATCGACAACGGCGCGGAACTCTTCGGCGACAACACCAAACTGGCAGACGGTTC 
TTTTGCCAAACACGGCTATGCAGCTTTGGCCGAATTGGATTCAAACGGCGACAACATCAT 
CAACGCGGCAGACGCCGCATTCCAATCCCTGCGTGTATGGCAGGATCTCAACCAGGACGG 
CATTTCCCAAGCTAATGAATTGCGTACCCTTGAAGAATTGGGTATCCAATCTTTGGATCT 
CGCCTATAAAGATGTAAATAAAAATCTCGGTAACGGTAACACTTTGGCTCAGCAAGGCAG 
CTATACCAAAACAGACGGTACAACCGCAAAAATGGGGGATTTACTTTTAGCAGCCGACAA 
TCTGCACAGCCGCTTCACGAACAAAATGCTATCCATTAGCCATGTTCGGGAAAACACGAT 
TTCCCCGTTTGTTTTAGGCTGTCTAAACAAATAACCATAAATGTATATCATTATTTAAAA 
TAAATAAAAGTATTTAACTATTATTGACGAAATTTTAGAGAAAGAGTAGACTGTCGATTA 
AATGACAAACAATAGTGAGAAAGGAAATATTTACTATCCGAGCACAGAGCATATTTTAGG 
TAGCCTGTAACTGTTCCTGCTGGCGGAAGAGGATGAAGGTTGACTTACCCGAGAATAAAT 
GTCCTGTTGTGTGATATGGATGCCATGCCGCGAAGCAATTGATGCAATCACGGCAGTCCT 
ACTTGAATGAAACCTGTCGTTGCAGAATTTGAAAACGCTATTTTTAAGAAAGGATAAAGG 
GAGAAAGAATTTTTGGTTTTTAAGCTGCATGAAACCGTGTTGGAATAAATGCACACCTAC 
GATAATTAATAATTTTCGTTTTTTATTCTACAAGCTATTTATATATGATTGCTAAAAGTT 
TATTTTTTAGATGCCAAAAAATATATTTTATATACTTCATATTGTTTATATGTCTTTATT 
TGAATATATCTTACGATGGGGAAATATTTATATATTTTATAATAAATTTTACTCATTTGC 
TAATATGTCATGGAATATTACTTGTATTTTGTAGAATTTTTCCATATGAAAATATTCCAT 
TTACTATTTTTCTGAACTTTATTAGTTTATTTTTAATATTTTTACCTCTTATATTTACCA 
TAAGAGAGCTAATTGATTCATATTATATTGAGTCGATAATTAATTTATTCTTAATTTTAA 
TTCCTCACGTTATTTTTTTAATTTACTTGAAAGGAAAGCAGATATGACATCTGCAAATTT 
TAATATTAACGGTTTTGGAGATGTGAAATTAACACCCTATTCACCACTCTTGGGATATAA 
AGCTTGGGATTCATTTATTGGTTCTATTCAATCCTTATCTGATTTAATCTATAATGTGGA 
TAACAATAGAAATAAAATGGAAATTACTGTTAATAATGCTATTCAAGCTGCAGATAGCTT 
TTTAAGCAGTAATTGGAAGAGATAACAAAATAACTWUVTAACAAAATAACAAATACTGCT 
TCTTTACTTGCATCCTTCGATAACATTTTTTAAATTTAAGAAATGTATCTCGAGATATAC 
GAGAAACAGGAAAATTTAAACCTAATGATATTCAACAAGCAATTGGTGATATATTCATTG 
CTGCTGGTGATGGATTACAATATATAAAACAACAAACAGAGGCGATGGCTCAAAGCAAAT 
TCTTACCAACTAAATTAAAAACTGGTTTAAATGATGTCCTTAATTCTAGAATGCTAAAAT 
CCTCTACTGTTTTACAGCATGAATTGAATTAAATAAGGATTATGGAAACGAGAGGCTTGG 
CGAATCTATAATGAATATAGATGATTTTACACCAAGTAAGATAGCAAACTTTTTTGCGGA 
TCCTGATACATACAGCAATGTATTAGAAGAAGTATCTAGGTTTATATATTCCTTAGTTCC 
TGATGATGCAAACCCTTGGAAAGGGGGCGAAGATTATATTGGACGAGGGATAAGTGAATG 
GGGAGAGTTACTGGAAAAATGGTATAAACAAGATTTTCTCCCTTATCTTGAAAAGAATGG 
GACCAATTTCCGAAATTTGAAGATTGGCTGCCTGAATTCCCTGAATGGGCAAGAGAGTGG 
TTGAAATTAGCTCTCAAACGTTCAGGCAAATATAACGTTTACGATCCCCTCGCCCTAGAT 
TTGGACGGCGACGGTATAGAAACCGTTGCCACCAAAGGCTTTTCAGGCAGCTTATTTGAT 
CACACCAACAACGGCATCCGCACCGCCACGGGCTGGATTGCTGCATATGACGGTTTTCCT 
GTGCGCAAATTAAACAGTAACGGGGGCATTATTAGCACGACAGATACCATATTCCAATCT 
TTGCATACATGGCTTGATCATCAACCAAGATGATATTTCCCAAGCACAGCATGATGCATG 
CCATTGAAAAATATAGAAAATTAATTGAAAGCTTAAATGGATATTGAAATGAATGATCAC 
ATAGTACAAATTATAAGAAGGTTCGGGCTAGGTAGGATATTTTTTTATTCGTATAGCAAA 
TCATCTATAATAATTTTTTCTTCGTATGTTGTTTATTATATAATTTACAATTATCAATTT 
AATTACCTTTCGCTTTTAATTTATTTATTACCAATATTGTGCAGTATATATATGTTTATA 
TTTTTTTTAGGGAAAACTAAGGATACATTAACGACAGAGCGAAGAAAAAAATTTTTTAAT 
TCTATTTTTCCACTTAGAATTCTAATGATAATAGGTTCTGAGAAAAAGAGGTTAGGCATC 
GGTAGTTTTTATTTGCTAAACCTACTATGGATTATTTGGTGTCTTATGATTCATAGAGAA 
CAAGTCCCATTAAATAACTTAACCCTCCTATTATCCTTCATATTTTCATTATTTTTTTTA 
TTATGTGATTTTGTTTTATTATTGAATGTTTATGTTTATTTTTTTAAATTAAGAGAGGCT 
TAATATGGTTAATCAAATCAAATCTGATAATAATTCAGTTTCTATTGAATTTATATAAGA 
TTTTATAACTGCAAGTACGGATGTAATTAATCTGAGTTACGAAAATTTTCGTAAAAATTT 
TTATACACAAATGTCAACTGATTCTACCAATTATGCAGCCAAACATGAAAGTTTAGGAAA 
ATCGGTACAACGTGAATTACAAAAAACACAAAGTCAGTTGAGACAAGTTGTAAGAAAAAT 
GCAGAGTAAATATAATATAAATAATAAAGCACGAGTAGCAGAAATATCTTTGTTAAGGCA 
AATGCAAAGCCAATTTTCTCGAAAATATGTAAACAAAAATCTTGGTAACAGCAACACTTT 
GGCTCAACAAGGCAGCTACACCAAAAAAGACGGCACAACCGCGCAAGCAGGCGATTTGCT 
GTTGGCTGCTGACAACCTGCACAGCCGCCTCACGGACAAAATGCTATCCATTAGCCATGT 
TCGGGAAAACACGATTTCCCCGTTTGTTTTAGGCTGTCTAAAACAAATAACCATAAATGC 
ATATCATTATTTAAAATAAATAAAAGTATTTAACTATTTTTGACAAAATTTTAGAAATAG 
AGCTAGAGTTTTAGTTAAGTAGAAATTGATAGTGCTTCAAGGGAAGTATTCTCTATGTTT 
GCATTAAAGGGGGTCTGATAAAGCTATTATTCATTACTATGGACTTTTATTTCATTATTT 
TCAGGCGGAAATCTCATAGCCGTTTTGAATTTTTCTCTTCCTTATTAATTATACAAATAA 
TTAGTATATTCTGATATGGATTTTTTGGAAATTTTTATTATGTCTGCATTTAGAAAAATA 
TTATTAATAATATCTTGCCTATTGATTGCTAGCTGCAGTTTTGTTGAAACTATTTTTTAT 
ATGGCTATTAGCCCAGAACCTGTTGTGGTAGACTTTCCTCTTGGTAAAAAAACAAAAAGA 
TCTATTGAACTCAAACAGAAAATTGGTAAACCTTATGCAATATCGTTAGGAACTAATTTT 
ATACATTATGATCCAAAACAGGGGGAGAGGTGGATTGATGATAAGTTAAACTATCCATAT 
AATATATCGGTTAAAATATTTAAAGTGGAAGAAGATGGTAAAAAACTTATTATAGATGAG 
TTGCTTACAGAGAGAAGTAGAAAATTAGGAGGCGGAGTATTTGGAGCTGGGGGAAAATAC 
AGTATGCATATTTATGATTTTTATTTGCCGGAAGGGGAATATTTATTTGAGATTTCTGAT 
AATAGTGAATATATTCCACTTTACGATGAAATAAATAATTCTATAAGAATAGTAGTTAAT 
GCACGAATTCAGTAAATTTTTCTAGAAATGTGGGGTTACTTATGGCTGATTATTATGCGA 
TAACTGTAAAATTTGCGAAGCAGGGTACGCCACTGAAACAAGAGGGGGTGTATCCAAGAC 
GGGTACGTTTGGGTTGAACTGTATTCGGCTAGAGATAAAAAAATCGGGGCTGTACTAGAT 
TAGCCCTAAATTCCACACCAATCCCGCAGGATTTTAAGCTGTTGAGACGGTGTGCCGAAG 
TTAAATCGTW^TTCGCATTCTTTCAAGAACAGCGGGAAAGATTTACGATCGATTCCGTTG 
TATTTTCGCAAGACGCGTTTTGCCTGATTCCAAAT^TTCTCAATGCCGTTAATGTGGTTC 
TGACGGTCTGCAAATTCCTTGGAATGGTTGATGCGGTAATGGATAAAACCGCTCACGTCC 
AACTTGTCGCAGCTGCTCAGACTATCGGTATAAACAATACTGTCCGGCATGATTTTCTTT 
TTGATGACAGGGAGTAACGTTTCAGACTTGGCATTATCTACGACAACGGTATAGCCCCGT 
CCGTTGCGTTTCAGAATGCCGAAGACAACCACTTTTCCTGCCGCACCGCGACCACGTCTG 
CCTTTACGCCGTCCGCCGAAATCGCTTTCGTCCGGCTCGACAGGGCCCTCAAAAACCTCA 
TCGGCAGCCAAGGCCAAATGATGGTTGATAACCGTGCGGATTTTACGGTAGAACAGTACT 
GCCGAATTGGGATGGATACCCAAAATATCGGCGGCAGAACGGGCGGTAACTTCCAGCACA 
AAAAAACGGAGCAGTTCTTTCTGTACTTTTTTCTTTAATTTGCAGTGCGTTATCTTCATA 
TTTCGAGGGTAACATATCTGCTAATCTAGTACAGCCCCAAAAATATACCAAAAACAGCAA 
AACAAATTGTAAGGATAGGTATAGGCTTTGTAAAGGTAAATTGTGAAAAAAGCAGTTTTT 
TAAACGAATGAAACGGCTTCGGGCTGAAATATATGCTGATGCCCTGTCCTTCCCGTATAT 
CTTGTGTGTTGTCAAAGTGCAGGCTGCTTTGAAATCGGTATTGCCATCTATGAACCACCA 
CTTTGTTTTATTTCAGCGGGCTTGAGATGTGTATAAGAATATTGTTTTGAATAAATTTAA 
AAAAATGATAATCGTTATTGACGATTTTTAAAGGAAAGCGTAGAGTGCCAATTCTATGAA
GCAATACGGTAAGTAACAATGAAAATATCTACTGCTTGGGTATAGAGCATATTTCACAAC 
CCGTAACTATTCTTGCGGAAACAGAGAAAAAAGTTTCTCTTCTATCTTGGATAAATATAT 
TTACCCTCAGTTTAGTTAAGTATTGGAATTTATACCTAAGTAGTAAAAGTTAGTAAATTA 
TTTTTAACTAAAGAGTTAGTATCTACCATAATATATTCTTTAACTAATTTCTAGGCTTGA 
AATTATGAGACCATATGCTACTACTATTTATCAACTTTTTATTTTGTTTATTGGGAGTGT
TTTTACTATGACCTCATGTGAACCTGTGAATGAAAAGACAGATCAAAAAGCAGTAAGTGC 
GCAACAGGCTATIAGAACAAACCAGTTTCAACAATCCCGAGCCAATGACAGGATTTGAACA 
TACGGTTACATTTGATTTTCAGGGCACCAAAATGGTTATCCCCTATGGCTATCTTGCACG 
GTATACGCAAGACAATGCCACAAAATGGCTTTCCGACACGCCAGGGCAGGATGCTTACTC 
CATTAATTTGATAGAGATTAGCGTCTATTACAAAAAAACCGACCAAGGCTGGGTGCTCGA
ACCATACAACCAGCAAAACAAAGCGCACTTTATCCAATTTCTACGCGACGGTTTGGATAG 
CGTGGACGATATTGTTATCCGAAAAGATGCGTGTAGTTTAAGCACGACTATGGGAGAAAG 
ATTGCTTACTTACGGGGTTAAAAAAATGCCATCTGCCTATCCTGAATACGAGGCTTATGA 
AGATAAAAGACATATTCCTGAAAATCCATATTTTCATGAATTTTACTATATTAAAAAAGG 
AGAAAATCCGGCGATTATTACTCATCGGAACTATCATAGGTATGGAGAGAACGATTACAG
CACTAGCGTAGGTTCCTGTATTAACGGTTTCACGGTACGGTATTACCCGTTTATTCGGGA 
AAAGCAGCAGCTCACACAGCAGGAGTTGGTAGGTTATCACCAACAAGTAGAGCAATTGGT 
ACAGAGTTTTGTAAACAATCCAAGTAAAAAATAATGGGGCTGTCCTAGATAACTAGGATA 
AACTCGATTTTACTAATTGTTTTAAAATGGAACAAGAACTTTTATCTCACTGTTGTTAAA 
ACGCCATTCGCACTCCTTTAAATACAGCTCAAAATGCGCTTTGGGAATGCCGTTAAACTT
GCGTAAATGACGTTTTGCCTGGTTCCAAAAGTTCTCAATTCCATTAATATGGTTTTGTCG 
TTCAGCAAAATGTGTGCTGTGATTGATACGAAAACGAAGTTTCAGCGAAGCTAAAATGGC 
TAAATTCGCGCACATCTAATACATCATAGCTACGATAACAATCCGTATAAATAATGCTGT 
CAGGTTTCACTTGTTCACGGATAATAGGAAATAAAGTAGCGGTTTGAGTATTCGGTACTG 
TAACCGTATAAACCTTACCATTTCGCTTCAAAAGACCGAATACGGCGACTTTACCGGCAG
CACCGCGACCGCGTTTGCCTTTGCGTTGTCCGCCAAAATAACTTTCATCTGCTTCTACTT 
CGCCATCAAACATTTCCAAATGCGGACTGTTTTGATAAATAAGTAATCGTAAACGATGAA 
AATAATAGGCTGCGGTATTTTTATTAACGCCTACTAACTCTGCTGCCGTTCTTGCAGTTA 
CACCTGTGACAAATAGCTCAATGAGTTTATTTTGTTTATACTGGCTTAGACGACTTTTTC 
TCATAGGGATAATTCTAACTTAATTTGAATTTCCCTAGTTATCTAGGACAGCCCCTATTC
TTTAACTAATTTCTAAGCTTGAAATTATGAGACCATATGCTACTACCATTTATCAACTTT 
TTATTTTGTTTATTGGGAGTGTTTTTACTATGACCTCATGTGAACCTGTTAATGAACAAA 



CCAGTTTCAACAATCCCGAGCCAATGACAGGATTTGAACATACGGTTACATTTGATTTTC 
AGGGCACCAAAATGGTTATCCCCTATGGCTATCTTGCACGGTATACGCAAGACAATGCCA 
CAAAATGGCTTTCCGACACGCCAGGGCAGGATGCTTACTCCATTAATTTGATAGAGATTA 
GCGTCTATTACAAAAAAACCGACCAAGGCTGGGTGCTCGAACCATACAACCAGCAGAACA 
AAGCACACTTTATTCAATTTCTACGCGATGGTTTGGATAGCGTGGACGATATTGTTATCC
GAAAAGATGCGTGTAGTTTAAGCACGACTATGGGAGAAAGATTGCTTACTTACGGGGTTA 
AAAAAATGCCATCTGCCTATCCTGAATACGAGGCTTATGAAGATAAAAGACATATTCCTG 
AAAATCCATATTTTCATGAATTTTACTATATTAAAAAAGGAGAAAATCCGGCGATTATTA 
CTCATTGGAATAATCGAGTAAACCAGGCTGAAGAAGATAATTATAGCACTAGCGTAGGTT 
CCTGTATTAACGGTTTCACGGTACAGTATTACCCGTTTATTCGGGAAAAGCAGCAGCTCA
CACAGCAGGAGTTGGTAGGTTATCACCAACAAGTAGAGCAATTGGTACAGAGTTTTGTAA 
ACAATTCAAGTAAAAAATAATTTAAAGGATCTTATTATGAATGAGGGTGTiAGTTGTTTTA 
ACACCAGAACAAATCCAAACCTTGCGTGGTTATGCTTCCCGTgGCGATACCTATGGCGGT 
TGGTATCCGAGCT 




The following partial DNA sequence was identified in N. meningitidis <SEQ ID 4>: 



gnm_4 

CGGGGCTTACAACATCGGCGCAATCGACAAAAAGACCATGCGCGACTTTGACAAGTCCTG 

CCTGACCGAAATCAAACCGTTGAGCGGCGGAGACATCAAGGCAATCAGGGAGAAGGAGGC
ACTATCGCAAGCCGCTTTCGCCATCTATCTCAACGTGGGAAAAAATCACGTTTCGGCTTG 
GGAGCGGGGCGTTAAAAAGCCGAGCGGCGCGGCGTTGAAGCTGCTGACCATCGTCAAAAA 
CAAGGGCATCGAAGCCATTGCGTAGCCGACTTGGCAAACGGCAAAATCAGCAAGTTCACA 
ATAGACGCGCTGCTGAATATGCCTGCCAAGACAGGCAAGACCGCCGAACTGAATATCAGG 

GCGTAGCCGCATAAATGCCCGACCGCATCAAACCAAGCCGAAACGGCGGCGGTGCAGACG
ACATAGCCCGACAGCAAGGCACGGCGCAGACGGGCGCGAAACCCGAAACATCACCGACCG 
CGAGGTACGGGGATTTTTTGCGCCCGTTGCAGGGGGGGATTGGATTTAAGCGGCGCGGGC 
TTGAAGGCAAAACGGGTGGGGCACAGAACTGTTTAAATGCAGTCTGAATCTCAAACGATT 
TCAGACGGCATTTTGAA/iCAATGGCTCAAATTCTCGATCCCCTTCCCTTAACGCCGACGT 

TTTTTATTAACGCGCCCCTTATTTCTGACACTTTGCTCATAAACCGGCATAACGGTCGGC
AACAACCGTTTTAGATTTTCTATACGGGCATTGTTTGTCGGATGAGTAGAGGTAATAGCA 
TAAATAAAGCCGTTTTGGTCGTTTTCCTGATTCATTTTTTCCCAAACCCTGACAGCGGCC 
GCCGGATGATAGCCTGCCTGCGCCATCAACATCATTCCCCCCTCATCGGCTTCTTCTTCC 
AAGCTGCGGCTATAAGGCAAGGTAAGACCGTACGTCCCCAAAATATCCATACCCAATCCG 

ACCAATTCCGGATTAGTATCCGGTTTTTTGTCTAATATAATCTGCGTGCCTATCTGCGCC
GCCGTATTGGTCAAGATTTGCTGCCCGACCTTATTTTTACCGTGTTCATGCAGGGCGTGC 
GTCATTTCATGCCCCATAATGGCGGCAATTTCGTCATCGGTCAGCTTGAGTTTGTCGACT 
ATCCCCGTATAAAACGCCATTTTTCCACCGGGCATTGCCCACGCGTTCAGCTCATCGTTT 
TTGAAAACCGTCATTTTCCAGTCAAACTTATGGCTGGTATTATTTGCCGCATCGGCATAA 

GGCAGCATACGTCGAAATACTGCCTGCACCCTGCGGGCTGTTCTGGATGTGGTATCGACA
TTGCCGGCAGACTTGTTTAACTCAACCGTTTTCATATAATCTTTGGCAGCCGCAGCGTTC 
ATTGTGGCGGAATCATGACCGTAAACATCAGCAACGACCGCACAAGCCCCCAATACCGAG 
ATTACTGCCGACAGGCAGAGTATCCGTTTAAAGGAAGGAAGGAAGGAAATTTCATATTTA 
GGTTTACTCCTTAAAAAATTAAATTTCAAAAAAATGCCGTCTGAATCCAAAACGGATTTC 

GGACGGCATCTTAACATTGTTTAATGTTTTTAAAAAGATTTACACCACGATGTTCTCCAG
TCTGCCCGGTACGGCGATGATTTTCTTGGCAGGCTTGCCTTCTATGAATTTCACCGCGCC 
TTCAGCGGCGTATTCGGCAGCTTCTTCAGCCGGTTTGTCGAAATCACGCATAAATTGCCA 
ATAATTCTCCAACTTTTTTACGGCTGCTGCTGCCTTTTGCGGCAATATTGCGCTGAACTT 
CAACTGTTTTCAAAATGGCAGAAGAATAAATATCCCTTGTGAATTCAGTATCATGATTTG 

AAATCAAAATACCTTGGGAGTTGGGCGCAATTTATTGATTTTTTGTAAAGTCCGCGACCA
ATGAATTCGATCGTATTTTGGTCGCGCAGAATTTGCAACTGTTGGCGGATTTTGTCTCTG 
ATATGGTTGTTTTGGGGAAATTGGATGGATAGTTTGTTTTCAAATTCATACATTTGCGAC 
AATGTGAATTCTTCGGGGAGTTGGTCGATACATTTCATAACAGCCAGAAGCCAGCCTTTG
CGCTCCGCATTTTGGTTGCGTAAAAACAAATTGGATTGCCATTTTTTCAGAACGGTTTCG 
GGTTCGATAATGCGGGAATTGTCTATTAAGAATATTTTGCCGCTTTCAGGCAAAGGGGCG 
AGATTGATAGAACACATAATGTGGTTCGGCCGGTTTTTAATGCCTTTATTTCTGGGAATA 
ATCATATCCGGCGTGATGAAATGTTTGGGTACAAGCACCAATTGCCGTATGGAGTAATCC
GCTTTTTTATATGCAAGAAAGAAAAAGTTGGGGTTGGTATCTGACCGGATGCGCTCCAAC 
ATGGTGTGATATGCACCGTCAGGCACGCTGTTGCCTATGGTTTTTTGATTTTTACTCTTT 
AATTCATATTGCTCGTGGCAATTTGGGCAAAAGAGGTCTGCAACAGGTTTGTTATTGGCA 
AATCTCTGCATCGGCTTGCTTCCGCAACAGGGGCAGTAGCCGTTTTTTTCCAACCAAGCC 
TCGCTCATTACACGGATTTTATGGGTTGCTTTATTTTGTTGCTTTCCCAATTCGGTATCG
AAAAATAAATTCATGTTTTGGATTTTGAGATTTCAGTTATTCGGGGTTCGTCATGCAGAC 
AACACAATCCACCTTAAAAAGGCCGTCTGAAACCCTGTTTCCAAGTTTCAGACGGCCTTT 
ATCCGTGTGGCTAAACCTTAAAAGCGGTTAGACGACGATGTTCACCAGTCTGCCCGGTAC 
GACGATGATTTTCTTCGCCGGTTTGCCTTCCATGAATTTCACCGCGCCTTCGGTGGCGAG 
TGCGGCGGCTTCGAGGTCGGCTTTGGATGCGTCGGCGGCAACAGTGATTTTGCCGCGCAG
TTTGCCGTTGACTTGAACCATCACTTCGATTTCGGATTTGACCAAGGCGGCTTCGTCGAC 
TGTCGGCCAGCCTGCTTCCCACAGTTTCGCGCCGTTCAATTCGCTCCACAGGGTTTCGCA 
GATGTGCGGCACGATGGGCCACAACAGGCGTACGGCGGTTTCCAATACTTCTTGGGCGAC 
GGCGCGTCCTTGTTCGCCGCCGGTGTCGGTTTTGTCGTATTGGTTGAGCAATTCCATCAC 
GGCGGCGATGGCGGTGTTGAACTGCTGGCGGCGGCCGTAGTCGTCGCTGACTTTGGCAGT
GGTCGCGTGCAGTTTGTGGCGCAGGTCTTTGAGTTCTTTAGACAAACCGTCTTGGCTGCC 
TGCGAACGCTTTGACCGCTTCGCCTTGCTTCAAGTATTCGTAAACGGTACGCCACAGGCG 
GCGCAGGAAGCGGTGTGCGCCTTCGACGCCGCTGTCGCTCCATTCGAGGGACTGTTCGGG 
CGGTGCGGCGAACATCATAAACAGGCGGGCGGTGTCCGCGCCGTAGGCGTTAATCAGTTC 
TTGCGGATCGACGCCGTTGTTTTTGGACTTGGACATTTTTTCCGTGCCGCTGATGACGAC
GGGCAGCCCGTCGGCTTTGAGGACGGCGGAAATGGGGCGGCCTTTGTCGTCGAACGTCAG 
CTCGACATCGGCGGGGTTGATCCAATCTTTGCCGCCTTTGTCGTTTTCGCGGTAGTAGGT 
TTCGCAAACGACCATGCCTTGCGTCAGCAGGCGTTCAAACGGTTCGTCAACATTGACTAG 
ACCTTCGTCGCGCATCAGTTTGGTGAAGAAACGCGCGTACAAGAGGTGCAAAATCGCGTG 
TTCGATGCCGCCGATGTATTGGTCGACCGCGCCCCAGTATTTCGCGGCGGCAGGATCGAC
CATGCCGTCTGAAAATTTTGGCGACATGTAGCGGAAGAAATACCAGCTCGATTCCATGAA 
GGTGTCCATGGTGTCGGTTTCGCGTTTCGCCGCGCCGCCGCAGCATGGGCAGGCAGTTTC 
GTAAAACTCGGGCATTTTTGCCAGCGGCGAACCCATGCCGTCGGGTACGACGTTTTCAGG 
CAAAACGACCGGCTy^TTGGTCGGCAGGGACGGGTACGTCGCCGCATTGTTCGCAATGGAC 
GATGGGAATCGGGCAGCCCCAGTAGCGTTGGCGCGAAATGCCCCAGTCGCGCAGGCGGTA
TTGGGTTTTCGGTTCGCCCGCGCCTTGGCTTTGCAGCTTGGCGGCGACGGCGTCGAATGC 
CGTCTGAAAATCCAAGCCGTCCAAGTCGCCGCTGTTGACCAATACGCCGTTTTCTTTGTC 
GCCGTACCATTCTTGCCATTGGTTTTCGTCAAATGCGTTGTCGCCGACGGCAATGACTTG 
TTTTTTCGGCAGATTGTATTTGGTGGCGAACTCAAAATCGCGTTCGTCGTGCGCCGGAAC 
CGCCATCACCGCGCCGTCGCCGTAGCCCCACAATACATAGTTGGCAATCCACACTTCCAG
CTTGTCGCCGTTGAGCGGGTTGACGACGTAGCGGCCGGTCGGCACGCCTTTTTTCTCCAT 
CGTCGCCATATCGGCTTCGGCAACCGAACCGGCTTTGCATTCGGCAATAAATGCCTGCAA 
TTCGGGTTTGTCGGCGGCTGCGGCGGCTGCCAGCGGATGC'rCGGCGGC.VS.CGGCAACATA 
AGTCGCACCCATCAGCGTGTCGGGGCGGGTGGTATAAACTTGCAGGAATTTCGCGTAATC 
GCCTTCCAAGCCTTGTTTGCTGTCGTCTGAAACGGCGAAGCGCACGGTCATACCGCGCGA
TTTGCCGATCCAGTTGCGCTGCATGGTTTTGACTTGTTCCGGCCAGTGTTCCAGCTTGTC 
CAAGTCGTTGAGCAGCTCTTCGGCGTAATCCGTGATTTTGAAGTAATACATCGGGATTTC 
GCGTTTTTCGATCAATGCGCCGGAACGCCAGCCGCGTCCGTCGATGACTTGCTCGTTGGC 
AAGGACGGTTTGGTCGACAGGGTCCCAGTTTACCGTGCCGTTTTTGCGATAAACGATGCC 
TTTTTCAAACAGCTTGGTAAACAGCCATTGTTCCCAGCGGTAGTATTCGGGTTTGCAGGT
TGCGGTTTCGCGCGCCCAGTCAATCGCAAAACCTAGGCTTTTGAGCTGGGTTTTCATGTA 
TTCGATGTTATCGTACGTCCAAGCGGCAGGGGCGACGTTGTTTTTCATCGCCGCGTTTTC 
CGCCGGCATGCCGAACGCGTCCCAACCCATAGGCTGCATGACGTTGAAGCCGTTTAAAAG 
TTTGAAGCGGCTCAATACATCGCCGATGGTGTAGTTGCGCACATGCCCCATGTGCAGCTT 
GCCGCTGGGATAGGGGAACATGGAGAGGCAATAATATTTGGGTTTGGAAGCGTCTTCGGA
GACGTTGAAAATACGGGCGTCGTCCCATTTTTTCTGCGCCGCAGGCTCAATGGCGGCGGG 
CCGGTATTGTTCTTGCATAGTCATTCTGTTTTCGCTTAAAAACGTTGGAAAAATAAAGTC 
GGCATCAATTATAACAGGTTGCCGGAAGCGGCGAATCGGCAC-ATTGCCGGCAGGATGCGT
AAATTCGCACGCGCATTATTCCGTATGCCGTACAAATACACCGCGTTTATTGATACGCAC 
GTTTTTTATGCTAATATTACAAACCAAAATCAAATGTTTAAAACTCTCCTGATGCGGCTC 
TTCCGAACAAAAGGCAGACGGGCATCGGGTAAAAGAGGATTCTGCATATGAAAATCAAAC 
AAATCGTCAAACCGGGCTTGGCAGTATTGGCGGCGGGCGTTCTGTCTGCCTGCGCAACCA
AAAGCAACGTCAAAGCCGACGGCACGACCGACAATCCGGTTTTCCCGAAACCCTATTCCG 
TAACGCTCGACAACAAGCGCGGCACATTCCCGACTTATGACGAACTGGATCAGATGCGCC 
CCGGCCTGACCAAAGACGACATCTACAAAATCCTGGGCCGCCCGCATTACGACGATy^GTA 
TGTACGGCGTGCGCGAATGGGATTACCTGTTCCACTTCCATACCCCGGGCGTAGGTATCG 

ACCCTGAAAACACTTCCGGCGTAGAAGATGTTACTACCTGCCAATACAAAGTGATTTTCG
ATAAAGACAAATTTGCCCGCAGCTTCTACTGGAACCCCGTCTTCCCGAAAGATGCCGCCT 
GTCCGCCGCCCGCACCCAAAGCCGAGCCGCAAGTCATCATCCGCGAAATCGTGCCGGCAA 
AACCGAAACGTATCCGCCAATAATCCGACATGCCGTTCCGCCTGTTTTTAGGGATATTAT 
GCGGCCTGTCAATGGTTGCCCCCGTATATGCACAGGGGCAGCCGGATACGGTCGGCGACT 

TTATCCAAAAGAAAAAAGTCATCGTCGATACATCCAAAGCGGAACTCTGTTTCGCTGACG
ACCGTCAGTGCCACCCCGTCCTCATCGGTGTTGCCACGCCCAAGGGGACGTTCGGGCTGA 
CGCTGAACAGTACCGACAAGCCCGGATACGGCGGCGAAGTCATCGGTTTCAAGCAGGAGG 
GTGATTTTCTTTTCGCCCTGCACCGCGTTTGGAATCAGATACCGTCGGAAAGGCGGAACG 
AACGCATCGCCTCCCCGTCCGTGTCCGACAGGATTATGACCAACGGCTGCATCAACGTCA 

GCGATGCGGTGTACGAAAAACTGCGTCATTATTTTGTGTTGGAAGTGATTTGAAACAGAC
GGATACCGCACGCGCCGGTATCTGTTTTCACATTGCCCCGATGCCTGAAACAGACTGTCC 
GCCACGTCATGCCGTCTGAAACCGGCGCAGATGCCGCCAAGCCTTCAGACGGCATTGCCT 
GCCCGCTCCGACCGAACAACAACCATCTTTGGGAGAACCTTATGCCCGAACAAAACCGCA 
TCCTCTGCCGCGAACTGAGCTTGCTGGCATTCAACCGCCGCGTGTTGGCGCAGGCGGAAG 

ACCAAAACGTCCCCCTTTTGGAACGCCTGCGCTTCCTGTGCATCGTTTCATCCAACCTCG
ACGAGTTTTTCGAAGTCCGTATGGCGTGGCTGAAGCGCGAACACAAACGCTGCCCGCAGC 
GCAGGCTGGACAACGGCAAAATGCCGTCTGAAACCATCGCCGACGTTACCGAAGCGGCGC 
GCTCCCTGATACGGCACCAGTACGACCTGTTCAACAACGTCCTTCAGCCCGAGCTGGCAC 
AAGAAGGCATCCATTTTTACCGCCGCCGAAATTGGACAGACACACAGAAAAAATGGATTG 

AAGACTATTTCGACCGCGAATTGCTGCCGATCCTGACCCCCATCGGACTCGACCCTTCCC
ACCCCTTCCCGCGCCCGCTGAACAAATCGCTCAACTTCGCCGTCGAACTCGACGGCACAG 
ACGCGTTCGGCAGGCCTTCGGGGATGGCGATTGTGCAGGCACCACGCATCCTGCCGCGCG 
TTGTTCCCCTGCCGTCCGAACTGTGTGGCGGCGGACACGGCTTCGTCTTCCTCTCCTCCA 
TCCTGCACGCCCACGTCGGAAAACTCTTCCCGGGCATGAACGTCAAAGGCTGCCACCAGT 

TCCGCCTGACGCGCGACAGCGACTTGACCGTTGACGAAGAAGACCTGCAAAACCTCCGCG
CCGCCATTCAAAACGAGTTGCACGACCGCGAATACGGCGACGGCGTGCGGCTCGAAGTCG 
CCGACACCTGTCCCGCCTACATCCGCGACTTTCTGCTCGCGCAATTCAAACTGACCGCCG 
CCGAACTCTATCAGGTCAAAGGCCCGGTCAACCTCGTGCGCCTCAACGCCGTCCCCGACC 
TAGTCAACCGCCCCGATTTGAAATTTCCCACACACACGCCGGGCAGACTGAAAGCCTTGG 

GCAAAACCGCGTCCATATTCGATTTGGTGCGCCAATCGCCCATCCTGCTGCACCACCCCT
ACCAATCGTTCGACCCCGTTGTCGAAATGATGCGCGAAGCCGCCGCCGACCCCGCCGTGC 
TTGCCGTCAAAATGACGATTTACCGCACCGGCACGCGTTCCGAACTCGTCCGCGCCCTGA 
TGAAGGCGGCACTCGCCGGCAAACAAGTAACCGTCGTCGTCGAACTGATGGCGCGTTTTG 
ACGAAGCCAACAACGTCAACTGGGCGAAGCAGCTCGAAGAGGCGGGCGCGCACGTCGTGT 

ACGGCGTGTTCGGCTACAAAGTCCACGCCAAAATGGCACTGGTCATCCGCCGCGAAGACG
GCGTGCTCAAACGTTACGCCCATCTCGGCACGGGCAACTACCACCAAGGCACATCGCGCA 
TCTACACCGACTTCGGCCTCATTACCGCCGACGAACAAATCACCGCCGATGTGAACATAT 
TGTTTATGGAAATCACAGGTTTGGGCAAACCCGGGCGGCTGAACAAACTCTACCAAAGTC 
CGTTTACCCTGCACAAAATGGTTATCGACCGCATCGCACGCGAAACCGAACACGCAAAAG 

CCGGCAAACCGGCGCGGATTACCGCCAAGATGAATTCGCTCATCGAACCGACCGTCATCG
AAGCCCTGTATCGGGCAAGCGCGGCAGGCGTACAAATCGATTTGATTGTGCGCGGTATGT 
GCACCTTGCGCCCGGGTGTAAAAGGCTTGTCCGAAAACATCCGCGTCCGCTCCATCGTCG 
GCAGGCAGCTCGAACACGCGCGCGTGTATTACTTCCATAACAACGGCACGGACGATACCT 
TTATCTCCAGCGCGGATTGGATGGGGCGCAACTTCTTCCGCCGCATCGAAACCGCCACGC 

CGATTACCGCGCCCGAACTCAAAAAGCGCGTTATACATGAAGGACTGACCATGGCACTGG
ACGACAACACCCACGCGTGGCTGATGCAGCCCGACGGCGGCTATATCCGCGCCGCACCTG 
CCGAGGGCGAATCCGAAGCCGACCTGCAAAACGATTTGTGGACACTGCTCGGAGGCTGAC 
CCGCACCGCCCCAATCAAAAACCATGCCGTCTGAAACCTTTCCGTTTCAGACGGCATGGT
TTTACAGCAATCTAAACAGGGCGGACCGGAGTCAAAAACACACCTTCGCCATTCCTGCAC 
AAGCACTTCCCCTATACGCTCCCAACCCCAAGCCGCCGCATTCCAGACGGCATTATAGTG 
GATTAAATTTTAGGGGCTGTACTAGATTAGCAGATATGTTACCCTCGAAATATGAAGATA 
ACGCACTGCAAATTAAAGAAAAAAGTACAGAAAGAACTGCTCCGTTTTTTGTGCTGGAAG
TTACCGCCCGTTCTGCCGCCGATATTTTGGGTATCCATCCCAATTCGGCAGCACTGTTCT 
ACCGTAAAATCCGCACGGTTATCAACCATCATTTAGCCTTGGCTGCCGATGAGGTTTTTG 
AGGGCCCTGTCGAGCCGGACGAAAGCGATTTCGGCGGACGGCGTAAAGGCAGACGTGGTC 
GCGGTGCGGCAGGAAAAGTGGTTGTCTTCGGCATTCTGAAACGCAACGGACGGGGCTATA 
CCGTTGTCGTAGATAATGCCAAGTCTGAAACGTTACTCCCTGTCATCAAGAAGAAAATCA
TGCCGGACAGCATTGTTTATACCGATAGTCTGAGCAGCTGCGACAAGTTGGACGTGAGCG 
GTTTTATCCATTACCGCATCAACCATTCCAAGGAGTTTGCAGACCGTCAGAACCACATTA 
ACGGCATTGAGAATTTTTGGAATCAGGCAAAACGCGTCTTGCGAAAATTATAGTGGATTA 
ACAAAAATCAGGACAAGGCGACGAAGCCGCAGACAGTACAAATAGTACGAAACCGATTCA 
CTTGGTGCTTCAGCACCTTAGAGAATCGTTCTCTTTGAGCTAAGGCGAGGCAACGCCGTA
CTGGTTTTTGTTCATCCACTATACCTTTCCGACAGCCGAACAAAACCCCGAATCCGTCTG 
CACGGTTCGGGGTATATCTCCAATACGGGCATCGTGTTCCGGAAAACCGTCAAATCCGCA 
TCGGCATCACAATATATTTGAAATTCGGATTGTTCGGCACGGTAAACAGCGTCGAGCGGT 
TGGCATCGCCGAAGGCAAGCTGCATATCGTCGGAATGGATGTTGCGCAACACGTCCATCA 

GATAGCCGATATTGAAACCGACTTCGAGTTCGCCGCCCTGATAGGCGATTTCGATTTCTT
CGCGCGCTTCTTCCTGCTCGTTGTTGCTGCACACAACGCTCAACAGGCCGGGTTGCAAAA 
ACAATCGCGCACCGCGGAATTTTTCATTGGCAAGAATCGATGCACGTTCCAACGCGCCCA 
ACAATTCTGCCCTCGACAACACGAAAATCTTGTCGTTGTCCAAAGGAATCACGCGGTTGA 
AATCGGGGAATTTGCCGTCGATGACCTTGCTGACGATGGTCGTGCCGTTGCATTGGAAAC 

GCACCTGTTTGTCCAGCAGCTCGATTTGAATCGGATCGTCGGGGTTGTTCAACAGTTTGA
ACAGTTCCAGCACCGTTTTGCGCGGCAAAATCACTTCGGCGCGCGGCAAATCCGCATCAA 
TCGCGCAGGCTGCATAGGCAAGGCGGTGTCCGTCGGTCGCCACAAGGCGCAACTGGCTGC 
CCTCAACCTGCATCAGCAGACCGTTGAGATAATAGCGGATGTCCTGCACCGCCATGCTGT 
ACTGCACTTGCGACAGCATGGTTTTGAAACGCTCCTGCTCCAGCGAGAAAGTCGCGCTGA 

TGTCCTCGCCGACATTCATCATCGGAAAATCGGCGGCAGGCAGGGTCTGCAGGGCAAAAC
GCGATTTGCCCGCCTTCAGCGTCAGACGGCTGTCGTCCCAATCCAGCGACACCAGCGCAC 
CGGCAGGCAGCGCGCGCAAAATATCCTGAAATTTCTTGGCATTGGTGGTGATGCGGAAGT 
CGCCCGCGCCGCCCTCGGGACCCGCAGTGTCGATTTGGATTTCCAAATCGGTTGCCAAGA 
GTTTGGTCTGACCGCCTTTTCCCTCAATCAGGACGTTGGACAGGATGGGCAGGGTGTGGC 

GGCGTTCGACGATGCCGGTAACGGCTTGCAACGGCTTGAGCAGGCTGTCGCGCTCGGCTT
GTAAAATCAACATGTTCGCTCCTTTAAATCGGTTTGTATAGTGGATTAAATTTAAATCAG 
GACAAGGCGACGAAGCCGCAGACGGTACAAATAGTACGGAACCGATTCACTTGGTGCTTC 
AGCACCTTAGAGAATCGTTCTCTTTGAGCTAAGGCGAGGCAACGCCGTACTGGTTTAAAG 
TTAATCCGCTATATCTTTACCCTTCGGACGGCATGGGCAATATCATGTCGTCTGAAAACG 

TTTTCCATCAGTTTTG7VATCAGAATCAGCAGCTTTTCATAATCCTGAGCCAATTCCGGAT
CTTCTTCGCGCAGTTTCGCCACTGCCCTGATGCCGTGCATAACGGTCGTATGGTCGCGCC 
CACCAAACGAATCGCCGATAGACGGCAGGCTCAAAGTAGTCAGTTCTTTGGTCAGGCTCA 
TCGCCACCTGGCGCGGACGGGCAATGTTTCGTGTCCGTTTCTTACCGAGCACATCGCTGA 
TTTTGATGCGGTAATATTTCGCCACCGCATCGATGATGATGTCGGCGGTGATGACTTTGT 

GCTTCTCGGCAATAATGTCCTGCAAAGCGGTACGCGCCAAATCGATGTCGATGACGGGAC
GGTTCATAAAGCGGCTGCTCGCTCCGACACGATTAAACGCGCCTTCAAGCTCGCGCACGT 
TGGAACGGATCAGATTGGCAATGAACAGCGCGGCTTCGtCTTCGATACTGATGCCCGCCG 
CTTCCGCCTTTTTCTGCAAAATGGCGATGCGCATTTCCAATTCGGGCGGCTCGAGTTCCA 
AAGTCAGTCCCCATGAAAAACGGGATTTGAGGCGGTCGTCCATGCCTTCGATTTTCGCAG 

GCAACACATCGCAAGTGAGGATGAGCTGTTTTTTCTCGTTGTGGAAATGGTTGTACAGAT
AGAAAAACTCTTCCATCGTACGGTCTTTGCCTTTGATGAACTGGATGTCGTCGATAATCA 
GCAGGTCGTATTGCTTGTATTGCTGCTTGAACACGTCGTAAGTGTTGTTGCGAACCGCCT 
TCATAAAGCTGCGGATATAGTCATCCGAATGCATATAGCGCACTTTGGCATCGGGACGGT 
TTTTCAGCAGCTCGTTGCCGACCGCCTGCACAAGGTGGGTTTTGCCCAAACCCGTGCTGC 

CATAGAGGAAGAACGGGTTGTAACTCTGCCCCGGGCTTTCCGCAATCGCCTGCGCCGCAG
CCGCCGCAAGGCGGTTGCCCTTACCTTCTACCAACGTATCAAACGTGTAATCCGGAGACA 
GGTTGGTCTGCTCGTAACGCGCCTCTTCCGCATCGCGCTGCACGTCCGTCCGTGCTTTGG 
CAACTGCCACCGATTCCGGCCGGGAAGCAGACCCGGCAGCCTGACGCGGCTCGTGCGGCA
GGTTTTTCATACGTTCCGCCAAAATATCCGCCGCCGTTTTCGACGCAGCGGGTTTGACAG 
GCTCTTCAGACGGCAGCTCGTCCAACAGAACCTCCTGCACGGGCATTCCCTCCGACACCG 
CATGCAAGGACGGCTCGGCAGGTTCGACAGCACCTTCAACCGCCGCCATCTCATAACGCA 
CGCCTTCTCCCGGTTTGAATACGAAGGCGGAACGGCCGGCAGCCAACTCTTCCCTCACCG
CTTCTATTTTTCCGGCAAACTGGCTCTTGAGCATATTGCAGGCAAACTGGTTCTTGCCGT 
ACACCACCCATACGCCACCCTCCTCACCAACGGTAAGGGGCGCAATCCATTGCGCAAACT 
GCCCTTGAGGCAACATATCGTGAAGACGGCGGAGGCACAGCGGCCAAAACTCTGCTAATG 
TCATGGATAGGCTCGAATCGGTAAAAATGAAATCGAAAACAAAGAAAATATAATATTTTC 
AAAAGGAAAACAAATCTGTTGAACGCACATCGGTTCAAAACGCGACTGCCCGATTATACC
GACTCACGAATATTTTATCCACAACCCGTGCAAAAATTTATCCACAGAAAGGCGGCGGAA 
ATCCGCAGGCAATCGGGCAATCTTCCTGCAAAGTTTCTATATTGATTGACAAAAGCGGCA 
AATTGGAGTGTAATTCACGGTTTAATTATCTACCCATTCTATTTTAGGAAACATCATGAA 
ACGCACTTATCAACCTTCCGTTACCAAACGCiyiACGCACCCACGGCTTCCCTGGTGCGCT 
CCAAAACGCGCGGCGGCCGCGCAGTATTGGCCGCACGCCGTGCCAAAGGCCGCAAACGCC
TGGCGGTATAATTTTGGACTACCGCTTCGGAAGGCAGTACCGCTTGTTGAAAACGGATGA 
TTTTTCATCCGTTTTTGCATTCAGAAACCGCCGCAGCCGCGACCTGCTGCAAGTTTCGCG 
CTCAAACGGCAACGGGCTGGGCCATCCCCGCATCGGTCTGGTGGTCGGCAAAAAAACCGC 
CAAACGCGCCAACGAACGAAATTATATGAAGCGCGTTATCCGCGACTGGTTTAGATTGAA 
CAAAAACCGGCTGCCGCCGCAGGATTTCGTCGTGCGCGTCCACCGTAAATTCGACAGGGC
TACCGCAAAACAGGCAAGGGCGGAACTGGCACAACTCATGTTCGGCAACCCCGCAACCGG 
ATGCAGGAAACAGGCATGATCAGAACGGTACTCTGCAGGCAAGGTTCAGACGGCAACGGG 
TTTCCCATACAAGGAACATCCCGATGAACTTCCTATTGTCCAAACTCCTGCTGGGACTGA 
TACGGTTCTACCAATATTGCATCAGCCCGCTGATTCCGCCGCGCTGCCGTTATACGCCGA 
CCTGTTCGCAATACGCGGTCGAAGCGGTCAAAAAATACGGCGCATTCAAAGGCGGCCGGC
TCGCCATCAAGCGCATTGCACGCTGCCACCCTTTCGGCGGACACGGACACGACCCCGTTC 
CCTGACCCGACGCAATATTCAAATTGCACGCTTTCCTTTTATTTCCCATCGGTTTCTATA 
TAATGCCGTCTGAAGCTTCGGGCAGGCGGCACGACCGCCGGGTATGAAGCCCGCCCTTAT 
TCCCCGTCTATCGGAACACGCAACCTGCGGCATTTCCGACCATTCAkGAAACTCTTATGG 
ATTTTAAAAGACTCACGGCGTTTTTCGCCATCGCGCTGGTGATTATGATCGGCTGGGAAA
AGATGTTCCCCACTCCGAAGCCCGTCCCCGCGCCCCAACAGGCAGCACAACAACAGGCCG 
TAACCGCTTCCGCCGAAGCCGCGCTCGCGCCCGCAACGCCGATTACCGTAACGACCGACA 
CGGTTCAAGCCGTCATTGATGAAAAAAGCGGCGACCTGCGCCGGCTGACCCTGCTCAAAT 
ACAAAGCAACCGGCGACGAAAATAAACCGTTCATCCTGTTTGGCGACGGCAAAGAATACA 
CCTACGTCGCCCAATCCGAACTTTTGGACGCGCAGGGCAACAACATTCTAAAAGGCATCG
GCTTTAGCGCACCGAAAAAACAGTACAGCTTGGAAGGCGACAAAGTTGAAGTCCGCCTGA 
GCGCGCCTGAAACACGCGGTCTGAAAATCGACAAAGTTTATACTTTCACCAAAGGCAGCT 
ATCTGGTCAACGTCCGCTTCGACATCGCCAACGGCAGCGGTCAAACCGCCAACCTGAGCG 
CGGACTACCGCATCGTCCGCGACCACAGCGAACCCGAGGGTCAAGGTTACTTTACCCACT 
CTTACGTCGGCCCTGTTGTTTATACCCCTGAAGGCAACTTCCAAAAAGTCAGCTTTTCCG
ACTTGGACGACGATGCCAAATCCGGCAAATCCGAGGCCGAATACATCCGCAAAACCCCGA 
CCGGCTGGCTCGGCATGATTGAACACCACTTCATGTCCACCTGGATTCTCCAACCTAAAG 
GCAGACAAAGCGTTTGCGCCGCAGGCGAGTGCAACATCGACATCAAACGCCGCT^lACGACA 
AGCTGTACAGCACCAGCGTCAGCGTGCCTTTAGCCGCCATCCAAAACGGCGCGAAAGCCG 
AAGCCTCCATCAACCTCTACGCCGGCCCGCAGACCACATCCGTCATCGCAAACATCGCCG
ACAACCTGCAACTGGCCAAAGACTACGGCAAAGTACACTGGTTCGCCTCCCCGCTCTTCT 
GGCTCCTGAACCAACTGCACAACATCATCGGCAACTGGGGCTGGGCGATTATCGTTTTAA 
CCATCATCGTCAAAGCCGTACTGTATCCATTGACCAACGCCTCTTACCGCTCTATGGCGA 
AAATGCGTGCCGCCGCACCCAAACTGCAAGCCATCAAAGAGAAATACGGCGACGACCGTA 
TGGCGCAACAACAGGCGATGATGCAGCTTTACACAGACGAGAAAATCAACCCGCTGGGCG
GCTGCCTGCCTATGCTGTTGCAAATCCCCGTCTTCATCGGATTGTATTGGGCATTGTTCG 
CCTCCGTAGAATTGCGCCAGGCACCTTGGCTGGGTTGGATTACCGACCTCAGCCGCGCCG 
ACCCCTACTACATCCTGCCCATCATTATGGCGGCAACGATGTTCGCCCAAACTTATCTGA 
ACCCGCCGCCGACCGACCCGATGCAGGCGAAAATGATGAAAATCATGCCGTTGGTTTTCT 
CCGTCATGTTCTTCTTCTTCCCTGCCGGTCTGGTATTGTACTGGGTAGTCAACAACCTCC
TGACCATCGCCCAGCAATGGCACATCAACCGCAGCATCGAAAAACAACGCGCCCAAGGCG 
AAGTCGTTTCCTAAATGCCGCAGCATGAAAAATGCCGTCTGAAACCTGTTCAGACGGCAT 
TTTTATTGCCCACCCCCTATCGGGGCGGAAATCTTCAACCCGCATACATCACAAAAATCG
TCGGGCGTTTTTTCAGATTGGGCATTTCTTTTCTTTTTCGCCACTGCACGATTGTTTGAC 
TGATGATTTCCTGTGTCGGCAAGGTCAAATCCGTAGCCGTGCATAAACGCGTTTCAGGAT 
GCAGGTTTTCCACCGCATCGGCAAGCAGCGCATCATTGCGGTAAGGCGTTTCAATAAAAA 
TCTGCGTCTCGCCGCACTGGCGCGAACGCTGTTCCAAAGCCCGAAAAGCCTGAATCCGCT

CCATCAAAGCCAGCAGCAGGCTGGAAGGCCCGACCAGCGGACGCACTTCAAAACCGTGTT 
TATGCGCCAATGCCACCAAATTCGCACCCGGATCGGCCACAGCCGGGCAACCCGCCTCAC 
TGACAATGCCCATACTGCGCCCTTCTTGCAAAGGTTTCAGCAATTCCGGCAAAGTCTTCA 

AATCCGTATGTTCATTCAACGTTTGCAGATTCAGCTCGCGGATAGGCGTAGTCACGCCCA
AATGTTTCAAATGCGCACGCGCCGTTTTTTCCGCCTCCACGACAAAATCCGTCAGCCCGA 
CAATCGCCTGTTGTTCATGCGGCAACAGGCACGGCGTGTCAGGCGTACCCAAAGGCGTAG 
GAATCAAATACAAAACAGGAGACATCATTCCCTCACTCATCGGTTAAAAATGCCGTCTGA 
GCCTTTCAGACGGCATAAACGGGCAGTTACAGAACCTCCACGCCCTCATTTTTCAAGAAA 

TCGACCAGACGGAAAACCGGCAAACCGATTAAAGCATTCGGATCGGTACTCTCAATCCTT
TCAATCAGCAATGCACCCAAATCCTCACTCTTCAGCGCACACGAACAATAAACCGCATCA 
GGCTCGCGCTCCAAATAGCGGAGGATATGCAACTCGTCCAACTGCCTCATCACGACCACC 
GTCTTATCGATATGCCGCCGCATCCTGCCCGTAACCGTATTCAACAGCACGATCGCGCTG 
TAAAACTCAATCTCCCTGCCGCTCAAGTGCATCAGCATCTTTTGCGCATTGGCAAGGTTC 

ATCGGCTTGCCCCACTGCCTGCCGTCGCACCACGCCACCTGGTCCGCACCGACAATCAAC
GCCTCTGGGAAACGCCCGGTCAACGACCGCGCCTTACCCTCGGCAAGGCGCAATGCCGTC 
TGAGGGGCGGATTCCCCCAACATCGGCGTTTCGTCAAAATCGGGGGACGCCGCCTGAAAG 
GCAATGCCGAGCCTTTCCATCTGTTCGCGGCGGAAAACCGAACTCGTACCCAAAATCAAA 
GGCAGTTCCAAACCCATCCCATCCTCCTTACCGTTGAAAACACGCCCGAAGGGGCAGTAA 

AATCCAGCCATGCGCCGAAACACGGATACCCGCCTTCGGCGTACCGCAACATTTTTCTTA
AAAATATTGACGTTAGAACATCTAAATTATATCATATCCCGTTTATGTCAGACCCTAATT 
TGATTGACTTGGAAATTTTTGCCGCCGAAGGGCAGAACCTGCAAGGCAGTTTTCTGCTGG 
AAGAATTGGATGAACGCGTCAGTTCGCACGATTATCCCGCCGACAGGCAGACCAAAATAT 
CGTTTACACTGACCGGCGGTCGCGACCGGCTGCAACGCCTGTTCCTCGACCTGAACGTCA 

AAGCCGATATGCCCCTGATTTGCCAGAGATGTATCAAACCCATGCCGTTCATGCTTGATG
AAAGCAGCCGTATCGTCCTGTTTTCCAACGAAGAGTCCTTGGACGAATCCATGCTTGCCG 
ACGAAGAACTCGAAGGCATACTGATTGAAAAAGAACTCGACGTGCGCACATTGGTAGAAG 
ACCAAATCCTGATGTCCCTGCCCTTTTCGCCGCGACACGAAGACTGCGGCGACAATGGGA 
CACTGGAAGAAGTCAATCGGGACAAACCCAACCCCTTTGCTGTTTTGGCAGGTTTGAAAA 

GCAATTGATTAGGACACAGTTTATTTATCTAGGAGCTTGAAATGGCCGTTCAACAAAACA
AAAAATCCCCTTCCAAACGCGGTATGCACCGTTCGCACGACGCGCTGACCGCGCCTGCAC 
TGTCTGTCGACAGCACAACCGGCGAAGTACACCGCCCGCACCACATCTCCCCCAACGGTA 
TGTACCGCGGCCGCAAAGTGGTCAAAGCCAAAGGCGAATAATCCCTATTCGACTGACTGA 
AAAAGCCAGAACATTGCCATGCAATTACTGGCTTTTTTTGCATTGGACGCACCATCCGTC 

CAAACTTTCGCCATACGTCAACACACAGGGGCAAAGCGTTCCGTATAATACCCCQTGAAA
ATATTCCAAAAGCCCCAACCACCAAGGAAATTCCGATGAAACAGAAAATCTGGTACACCT 
ACGATGACATCCACCGCGTCATCAAAGCATTGGCAGAAAAAATCCGGAACGCCGACATCA 
AATACGATGCCATGATTGCCATCGGCGGCGGCGGCTTTATTCCGGCACGTATGCTGCGCT 
GTTTTCTGGAAATTCCGATTTATGCCGTAACCACCGCCTATTACGACAGCGACAACGAAG 

GACAGGTTACCGAAGAAGTCAAAAAAGTCCAATGGCTCGACCCCGTTCCCGAAGCCCTGC
GGGGCAAAAACGTACTCGTCGTCGATGAAGTGGACGACAGCCGCGTAACCATGGAGTTCT 
GCCTGAAAGAACTGCTCAAGGAAGACTTCGGTACGATCGGAGTCGCCGTACTGCACGAAA 
AAATCAAAGCCAAAGCAGGCAAAATCCCCGAAGGCATTCCCTATTTCAGCGGCATCACCG 
TAGAAGACTGGTGGATCAACTATCCGTGGGACGCACTCGACATCGACGAACACAACCGCC 

TTGCCGAGGCCGGCCGAGGCTGACCCTTTCAGACGGCATATTTTCCGAACCGATGCCGTC
TGAAGCCCGCACGACCCCTGCCGCAGACCGAAAACCTACCGGAGAAACCCTATGATTACA 
TTGGCCGTAGATGCCATGGGCGGCGACCAAGGACTTGCCGTTACCGTACCCGGCGCAACC 
GCATTCCTCCAAGCACACCCCGATGTCCGCCTGATTATGACCGGCGACGAAACGCAACTG 
CGCCAAGCCCTGACCGCGGCAGGCGCACCGATGGAACGCATCGACATCTGCCATACCACC 

CAAGTCGTCGGCATGGACGAAGCCCCGCAATCCGCCCTGAAAAACAAAAAAGACTCCTCC
ATGCGCGTCGCCATCAACCAGGTTAAAGAAGGCAAAGCCCAAGCCGCCGTATCCGCAGGC 
AACACGGGTGCGCTCATGGCAACCGCACGTTTCGTCCTCAAAACCATTCCCGGCATCGAA 
CGCCCCGCCATCGCCAAATTCCTTCCTTCCGACACCGACCACGTTACCCTTGCACTCGAC
CTTGGCGCGAACGTCGACTGCACGTCCGAACAGCTCGCCCAATTTGCCGTTATCGGCAGC 
GAACTCGTCCACGCACTCCATCCTCAAAAAGGACAGCCGCGCGTCGGGCTGGTCAACGTC 
GGCACGGAAGACATCAAAGGTACGGACACCGTCAAACAAACCTACAAACTGCTGCAAAAC 
AGCAAACTCAACTTTATCGGCAACATCGAAAGCAJVCGGCATCCTCTACGGCGAAGCAGAT
GTCGTCGTCGCCGACGGCTTTGTCGGCAACGTCATGCTCAAAACCATCGAAGGCGCGGTC 
AAATTCATGAGCGGAGCCATCCGCCGCGAATTCCAAAGCAACCTGTTCAACAAACTTGCC 
GCCGTTGCCGCCCTACCCGCCCTCAAAGGGCTGAAAAACAAACTCGACCCGCGCAAATTC 
AACGGGGCCATCCTGCTCGGGCTGCGCGGCATCGTGATTAAAAGCCACGGCGGCACAGAC 
GAAACCGGTTTCCGCTATGCCCTCGAAGAAGCCTACCACGAAGCCAAGTCCGCCGGCCTT
TCCAAAATCGAACAGGGCGTAGCCGAACAACTCGCCGCACTCGAAACTGCCAAAGCCGTC 
CAAAACGAAAATGTCGGCGGTCTGTAACACACACGATGCCGTCTGAACGCCCCCGCCCCT 
TTCAGACGGCATCCGCCCGCACCAAACCTGCGGGCGCGGACGGCGATGCGCCTGTCCGGC 
ACTTCCCAAATATCGCCTTGTAAAATAAGGAGTATTTGAAAAATGAAGACATTAGAAAAA 
CGGATGAAAGCTCTAGACAAACGGATTATGAAGTTCGGAAAATCCCTTGAAGGCAGGCTT
GATGCCCGTCTGATTGAATCCGCATTGGATTATATTCATTATTCGGAACGTTTTTTGGCT 
TTTGAAATCCTGTGTACTTATATCGAAGATTTCGATGTCCGGCTGACGGAACAAGAATCC 
CGGGAAATTTCTTTTATCAACAAGGAATTTGAGATAGAAAGCACGTCCGATTAACCAATA 
AAGCCAATGGGTTGATAAACATGAAAACATCGACGGTCGTTTTTGGCGGATTTTTTATGG 
CAGACAACGGAGAGCGAATCCAAATCCCCGTTTTGGTWW^TCCTGACATTAGGGAAATCA
ATCACTTTTTTTCCGTATCAAATTTTGAGAAAAAAACCGGCGTCCTTGTTTTCAGAATCA 
TCCCCGAGCCGGAATTTGGCAATACCGAATTAACTGTCTATTTTAAAAAAGGATATTATA 
GTGGATTAACAAAAACCAGTACGGCGTTGCCTCGCCTTGCCGTACTGGTTTTTGTTAATC 
CACTATATCAGACGAAAACAAACACCCGCGCCAATAGCCTGACGGCAACCCGGCAATCAA 
AATGCCGTCTGAAGCAGCTTGGGCTTTCAGACCGCATTTCCTTCGCTTAAAACAGCGTAT
CGGCAACCCCGCCCTGCCTGTCCACGGCAATCTGCATCTGAAAACCATCTGTATCCCAAA 
CCACACCCCCATCCCTGTTTCCATCATGTGCACCCTGTCCGTATTGGGCAATCATCTGTT 
TTTCGCTTACAATAGCCGAATCTGAACCAACTCTCTAAAAAGGCCGTTCCCATGCAGTAT 
GCAAAAATTTCCGGCACAGGCAGCTATCTTCCCGCCAACCGCGTCAGCAATGACGACCTT 
GCCCAAAAGGTAGATACCTCTGACGAGTGGATTACCGCGCGCACGGGCATCAAATTCCGC
CATATTGCAGCCGAAAACGAAAAAACCAGCGATCTTGCCGCCGAAGCGGCGCACCGCGCG 
CTGGATGCAGCCGGATTAGACAGCGGCGAAATCGATTTGATTATCGTGGCAACGGCAACG 
CCGGATATGCAGTTTCCGTCTACTGCGACCATCGTGCAACAAAAATTGGGCATCACCAAC 
GGCTGCCCCGCGTTTGACGTACAGGCGGTGTGCGCCGGCTTTATGTACGCGCTGACCACG 
GCAAACGCCTACATTAAAAGCGGTATGGCGAAAAACGCGCTGGTCATCGGCGCGGAAACC
TTCAGCCGCATTGTAGACTGGAACGACCGCACAACCTGCGTATTGTTCGGCGACGGCGCG 
GGCGCGGTGGTTTTAAGCGCGTCGGACACGCCGGGCATCATCCACAGCAAACTCAAGGCC 
GACGGCAATTATCTGAAACTCTTAAACGTCCCCGGGCAAATCGCCTGCGGCAAAGTTTCC 
GGTTCGCCGTACATTTCGATGGACGGTCCCGGCGTGTTCAAGTTTGCCGTCAAAATGCTG 
TCCAAAATCGCCGATGACGTTATCGAAGAAGCAGGTTACACCGCCGCTCAAATCGACTGG
ATTGTTCCCCATCAGGCAAACCGCCGCATTATCGAATCGACCGCGAAACATTTAGGTTTG 
AGTATGGACAAAGTCGTCCTGACCGTCCAAGACCACGGCAACACATCCGCCGCATCGATT 
CCGCTGGCTTTGGATACGGGCATCCGCAGCGGACAAATCAAACGCGGTCAAAACCTGCTG 
CTCGAAGGCATCGGCGGCGGTTTCGCGTGGGGCGCGGTGCTGTTGCAATATTGAACCCGA 
TGCCGTCTGAAACAGGCTTTCAGACGGCATTTCCCATATCATGAAGCGGCAGGCTTTCTT
CAAACTGATGGCGTGTGCGGCATTTCTGTCTGCCGTTTCGCTGCGCCTCCCCGTATTGGG 
CGCGTGTTACGCAATATTGTCCCTCTATGCGTTTGCACTTTACGGCATCGACAAACGGTG 
CGCCATACGGGGGCAACGCCGCATTCCCGAACACCGCCTGCTGCTGCCTGCATTGCTCGG 
CGGCTGGGTGGGCGCGTATTTCGGCAGCATGACATTCAAACATAAGACAGCGAAAAAGCG 
TTTTGTTGTGCTGTTCCGTCTGACTGTTTCAGGTAATGTCTTGGCGACCCTCATCCTGAT
TTATAGTGGATTAAATTTAAACCAGTACGGCGTTGCCTCGCCTTGCCGTACTATTTGTAC 
TGTCTGCGGCTTCGTCGCCTTGTCCTGATTTTTGTTAATCCACTATATTATTTTGTCCCG 
CCTGAATTTTTCGTAAAACTCGGGCAGAATACCTGATTATCCAACCAAACAAAGGAATAC 
TATGTCTTTTGCCTTCTTTTTTCCCGGACAAGGTTCCCAAAGCCTCGGTATGATGAACGG 
CTTTGCCGAACACGCCATCGTCAAAAACACCTTTGCCGAAGCCTCCGCCATATTGGGGCA
GGACTTGTGGGCGATGATAAACGGCAGCGATGCCGAAATCATCGGTCAAACCGTCAACAC 
CCAGCCCATTATGCTCGCCGCCGGCGTTGCCGTTTACCGCGCCTATTTAGAAGCGGGCGG 
CAAAACGCCTGCCGCCGTTGCCGGACACAGCCTCGGCGAATACACCGCACTCGTTGCCGC
CGGCGCATTGAATTTTGCCGACGCGGTCAAACTCGTGCGCCTGCGCGCCGAACTGATGCA 
GTCCGCCGTACCGCAAGGCGTGGGCGCAATGGCGGCGATTCTCGGCTTGGAAGATGAGCA 
GGTTAAAGCCATTTGTGCCGAAGCCGCCCAAAGCGAAGTGGTCGAAGCCGTCAACTTCAA 
CTCACCCGGACAAATCGTGATTGCAGGCAACGCCGCCGCCGTCGGACGCGCCATGGCTGC
CGCCAAAGAAGCCGGTGCCAAACGCGCCCTGCCGCTGCCCGTGTCCGTACCTTCCCATTG 
CAGCCTGATGAAACCCGCCGCCGACAAACTTGCCGAAGCCCTGAAAACCGTTGAAATCAA 
GCAGCCGCAAATCCGCGTTATCCACAACGCCGACGTTGCCGCCTACGATGATGCCGACAA 
AATCAAAGACGCGCTCGTCCGCCAGCTTTACAGCCCCGTACGCTGGACGGAAACCGTCAA 
CGCCCTCGTTTCAGACGGCATTGCCGAATCCGCCGAATGCGGCCCGGGCAAAGTGTTGGC
GGGCTTGGCAAAACGCATCAACAAAGCCGCCGCGTGCAGCGCACTGACCGATGCCGGACA 
GGTTGCCGCCTTTATCGAAGCGCACTGACTTCGTTCTGCAAAAAGCAGCCTGCCCTCTTC 
AGGCTGCTTTTCATGTCCGAACGACGGCTiGCCCCATATTTACGCTATAATCCATCCCGAC 
CAAACCACCGACAGCGGCTGCCGTTGCAGTTCCCGCCCTACCGATATGATAGAAAAACTG 
ACTTTCGGACTGTTTAAAAAAGAAGACGCGCGCAGCTTTATGCGCCTGATGGCGTACGTC
CGCCCCTACAAAATCCGCATCGTTGCCGCCC7GATTGCCATTTTCGGCGTTGCCGCCACC 
GAAAGCTACCTTGCCGCCTTCATCGCCCCCCTGATTAACCACGGCTTTTCCGCACCTGCC 
GCGCCGCCCGAGCTGTCTGCCGCCGCCGGCATCATTTCCACCCTGCAAAACTGGCGCGAA 
CAGTTTACCTATATGGTTTGGGGGACGGAAAACAA^ATCTGGACCGTCCCGCTCTTCCTC 

ATCATCCTCGTCGTCATCCGTGGCATCTGCCGCTTTACCAGCACCTATCTGATGACTTGG
GTCTCCGTGATGACCATCAGCAAAATCCGCAAAGATATGTTTGCCAAAATGCTGACCCTT 
TCCTCCCGCTACCATCAGGAAACGCCGTCCGGCACCGTACTGATGAATATGCTCAACCTG 
ACCGAACAGTCGGTCAGCAACGCCAGCGACATCTTCACCGTCCTCACGCGCGACACGATG 
ATCGTTACCGGCCTGACCATCGTCCTGCTTTACCTCAACTGGCAGCTCAGCCTCATCGTC 

GTCCTGATGTTCCCCCTGCTCTCCCTGCTCTCGCGCTACTACCGCGACCGTCTGAAACAC
GTCATTTCCGACTCGCAAAAAAGCATAGGCACGATGAACAACGTGATTGCCGAAACCCAT 
CAGGGACACCGCGTCGTCAAGCTGTTCAACGGGCAGGCGCAGGCGGCAAACCGGTTCGAC 
GCGGTCAACCGCACCATCGTCCGCCTCAGCAAAAAAATCACGCAGGCAACGGCGGCACAT 
TCCCCGTTCAGCGAACTGATCGCCTCGATCGCCCTCGCCGTCGTCATCTTCATCGCCCTG 

TGGCAAAGCCAAAACGGCTACACCACCATCGGCGAATTTATGGCATTCATCGTCGCGATG
CTGCAAATGTACGCCCCCATCAAAAGCCTTGCCAACATCAGCATCCCTATGCAGACGATG 
TTCCTCGCCGCCGACGGTGTATGTGCATTTCTCGACACCCCGCCCGAACAGGACAAGGGC 
ACGCTCGCACCGCAGCGTGTCGAAGGGCGCATCAGCTTCCGCA/iCGTCGATGTCGAATAC 
CGTTCAGACGGCATCAAAGCCCTCGACAACTTCAACCTCGACATCAGACAAGGCGAACGC 

GTCGCCCTGGTCGGACGTTCCGGCAGCGGCAAATCCACCGTCGTCAACCTGCTGCCCCGC
TTTGTCGAACCGTCTGCCGGCAACATCTGCATAGACGGTATCGACATCGCCGACATCAAA 
CTCGACTGCCTGCGCGCCCAATTCGCCCTCGTCTCCCAAGACGTATTCCTGTTTGACGAC 
ACCCTGTTTGAAAACGTCCGATACAGCCGTCCCGACGCGGGCGAAGCCGAAGTCCTGTTC 
GCCCTCCAAACCGCCAACCTGCAAAGCCTGATTGACAGCTCCCCGCTCGGACTGCACCAG 

CCCATCGGATCGAACGGCAGCAACTTATCCGGCGGACAGCGGCAACGCGTCGCCATTGCC
CGCGCCATTTTGAAAGACGCGCCGATATTATTATTGGACGAAGCCACCAGCGCATTAGAC 
AACGAATCCGAACGCCTCGTCCAACAGGCGCTCGAACGCCTGATGGAAAACCGCACCGGC 
ATCATCGTCGCCCACCGCCTGACCACCATCGAAGGGGCCGACCGCATCATCGTGATGGAC 
GACGGCAAAATCATCGAACAAGGCACACACGAACAACTGATGTCCCAAAACGGTTACTAC 

ACGATGTTACGCAATATCTCAAACAAAGATGCCGCCGTCCGGACGGCATAAACAAAATGC
CGTCCGAAATGGTACAATCGCCCCGACCCTTTCAGACGGCATCATATCCGCCGACCCATC 
CGATTATCTTCAATCACTGTAAAACCCATTATGACCCAAGACAAAATCCTCATCCTTGAC 
TTCGGTTCGCAAGTTACCCAGCTCATCGCCCGCCGCGTGCGCGAAGCCCACGTTTACTGC 
GAGCTGCATTCTTTCGATATGCCTTTGGACGAAATCAAAGCCTTCAACCCCAAAGGCATC 

ATCCTCTCCGGCGGCCCCAATTCCGTTTACGAATCCGACTATC?J^GCCGATACCGGTATT
TTTGATTTGGGCATTCCGGTTTTGGGCATCTGTTACGGCATGCAGTTTATGGCGCACCAC 
TTGGGCGGCGAAGTGCAGCCCGGCAACCAGCGCGAATTCGGTTATGCGCAAGTTAAAACC 
ATAGACAGCGAGCTGACACGCGGCATTCAAGATGGTGAGCCAAACACACTCGACGTATGG 
ATGAGCCACGGCGACAAAGTGTCCAAACTGCCCGACGGTTTCGCCGTCATCGGCAACACC 

CCGTCCTGCCCGATTGCCATGATGGAAAACGCCGAAAAACAATTCTACGGCATCCAGTTC
CACCCCGAAGTTACCCACACCAAACAAGGCCGCGCCCTGTTGAACCGCTTTGTCTTGGAT 
ATTTGCGGCGCACAACCGGGCTGGACGATGCCGAACTACATCGAAGAAGCCGTTGCCAAA 
ATCCGCGAACAGGTCGGCAGCGACGAAGTGATTTTAGGTCTGTCCGGCGGCGTGGACTCT
TCCGTAGCCGCCGCGCTGATTCACCGCGCCATCGGCGACCAACTGACCTGCGTGTTCGTC 
GATCACGGTTTGTTGCGCCTGAACGAAAGCAAAATGGTGATGGATATGTTCGCCCGCAAC 
TTGGGTGTGAAAGTGATACACGTCGATGCCGAAGGGCAGTTTATGGCGAAACTCGCCGGC 
GTAACCGACCCCGAGAAAAAACGCAAAATCATCGGTGCGGAATTTATCGAAGTATTTGAT
GCCGAAGAAAAAAAACTTACCAACGCCAAATGGTTGGCACAAGGCACGATTTACCCTGAC 
GTAATCGAATCCGCAGGTGCAAAAACCAAAAAAGCCCACGCCATCAAATCGCACCACAAC 
GTCGGCGGCCTGCCTGAAAACATGAAGCTCAAATTGCTTGAGCCTTTGCGCGATTTGTTC 
AAAGACGAAGTACGCGAATTGGGTGTGGCTTTGGGCCTGCCGCGCGAAATGGTGTACCGT 
CATCCGTTCCCGGGTCCGGGTTTGGGCGTGCGTATTTTGGGCGAAGTGAAAAAAGAATAT
GCCGACCTGCTTCGTCAGGCAGACGATATTTTCATTCAAGAATTGCGCAATACTACCGAT 
GAAAACGGTACATCTTGGTACGACCTGACCAGCCAGGCATTCGCCGTGTTCCTGCCCGTC 
AAATCTGTCGGCGTAATGGGCGACGGCCGCACATACGATTACGTCATTGCCTTGCGTGCC 
GTGATTACCAGCGACTTTATGACCGCGCATTGGGCGGAACTGCCGTATTCCTTGTTGGGC 
AAAGTGTCCAACCGCATCATCAACGAAGTCAAAGGCATCAACCGCGTGGTTTATGATGTG
AGCGGCAAACCGCCTGCCACCATCGAGTGGGAATAAACAGCAAACATGGCTGCCCCGTCC 
GGCGCAGTCCTTCGATTATCGGAAAAAAGGAAAAAATATGAGCACACAAGATTTAAACGG 
CAAAATCGCTTTGGTAACAGGCGCATCGCGCGGTATCGGTGCAGCAATTGCCGACACGCT 
GGCGGCAGCCGGTGCC/IAAGTCATCGGTACGGCGACCAGTGAGAGCGGTGCGGCGGCGAT 

TAGCGAGCGGTTGGCGCAATGGGGCGGCGAAGGCCGCGTATTAAATTCCGCCGAACCTGA
AACCATCGAAAGCCTGATTGCCGACATCGAAAAAGCGTTCGGCAAACTCGATATTCTGGT 
CAACAACGCCGGCATCACCCGCGACAACCTCCTGATGCGCATGAAAGAAGAAGAGTGGGA 
CGACATCATGCAGGTCAACCTCAAATCCGTGTTCCGCGCTTCTAAAGCCGTTTTGCGCGG 
TATGATGAAACAACGTTCCGGCCGCATCATCAACATCACATCCGTCGTCGGCGTGATGGG 

CAATGCCGGTCAAACCAACTATGCCGCGGCAAAAGCAGGCTTAATCGGTTTCTCCAAATC
CATGGCGCGCGAAGTCGGCAGCCGGGGCATTACCGTCAACTGCGTCGCCCCTGGCTTTAT 
CGATACCGACATGACACGCGCCCTGCCGGAAGAAACCCGCCAAACCTTTACCGCCCAAAC 
CGCCTTGGGCAGATTCGGCGACGCGCAAGACATCGCCGATGCGGTTCTGTTCCTCGCTTC 
CGACCAAGCAAAATACATCACCGGCCAAACGCTGCACGTCAACGGCGGTATGCTGATGCC 

TTAACAGACAACTTTTTCAACCATGCCGTCTGAAGCCCTTTCAGACGGCATTTGCATTCT
CAGGCAAAATGAACACACACCACACCCCGCCCTGCCCATGCGGCTCAGGCACAAGCTGAG 
ACCTTTGCAAAATTCCTTTCCCTCCCGACAGCCGAAACCCCAACACAGGTTTTCAGCTGT 
TTTCAGCTGTTTTCGCCCCAAATACCGCCTAATTCTACCCAAATACCCCCTTAATCCTCC
CCGGACACCTGATAATCAGGCATCCGGGTCACCTTTTAGGCGGCAGCGGGCGCACTTAGC 

CTGTTGGCGGCTTTCAAAAGGTTCAAACACATCGCCTTCAGATGGCTTTGCGCACTCACT
TTAATCAGTCCGAAATAGGCTGCCCGGGCGTAGCGGAATTTATGGTGCAGCGTACCGAAG 
CTCTGTTCGACCACATATAGTGGATTAACAAAAACCAGTACGGCGTTGCCTCGCCTTAGC 
TCAAAGAGAACGATTCTCTAAGGTGCTGAAGCACCGAGTGAATCGGTTCCGTACTATTTG 
TACTGTCTTCGGCTTCGTCGCCTTGTCCTGATTTTTGTTAATCCACTATACATCATCGCT 

ACTACCGTTCCGGCGCAACAGGCATTCCTCGATGCCGCCGAACTGATGCAATGGAGTATA
GAAACCGAAGGGCTGGGCTTGAACGTCATCTCGCACAAGATACTCGGCAAAGACCACGCC 
CAAGTCGAATTTGAAGCCTACTTCCGAGACGGACAACACCGATCCGCGCATCACGAACTG 
TCCGGCTTCGTCAACATCGGCGGACAATGGTATTTTATCGATCCCACCGTTCCGCATCCT 
GCGATGAAACAACCCTGCATTTGCGGATCAGGCAAAAAATTCAAAGCCTGCTGCGGCAAA 

TATCTGAAACCTGTCGCATAAAAATGCCGTCTGAACGTTCAGACGGCATTTTCAACGTGC
AAAAAAAACCATTCATACCAAGGGTAAGTATGAATGGTCAATACATTGCGGGAAAACGTC 
TTACTTGCTGCACTGCCGAAAAGGGAGAAACGGCAGCGGTAATCAGCGGAAAGGATTGTA 
CCCGAATTAATATTAAGAAACGTTAATCGCGAAAATATATTAACAAACCTGTTGAAACCT 
ATTGGTTTTCCCGTATCCACCCGACCCAGCGTTCAAACAGCTTCGGTTCGAGCGCGGCAA 

CGACCGAGCGTTTGAACACGTGTTCACCACTCCAAAACCCGTCGCCTTCC.ZUVAGTCGTCA
GCCTGCCGCCCGCCTCCTCGAAAATCAACGCGCCGGCGGCATAATCCCACAGCTTCTGCC 
CGCCGTGAACATAAACATCATAACGCCCGCACGCCAGATAACACCAATCCAACGTACTGC 
TGCCCATACTCCGTATCGTTCCAAAAGGCGCGAGCGTACTCATACGGCTGGAAAGTTTGC 
CCGAACGCAGATATTTGATTTCCACGCCCGCAATCGCCTCATTGAGTTTTTTATCCACGA 

GGCGCAGGGGCAGACGCGTCCCGTTTAAAAACGCCCCCTGCCCGCGTTCGGCATAAAAAC
ATTCGCCGCTGACTGGGTTGTAGATTACGCCCAACTCGGCGCGCCCGTTGCGGACAAACG 
CCACCGATACCGCAAAATGCGGCAGCCCGTTGACAAAATTGTTCGTCCCGTCTATCGGAT 
CGACAATCCACAGCCCCTTTTCCCCCGAATATTGTTCCCACAAAGCCGACTGTTCCTGCC
GCGACATTTCCTCACCCAACATCGGACTGTCGATTAAAAGCGGCAACGCGGCGGCAAAAG 
CCGTCTGCGCGGCAATGTCCGCCTCGCTCAACATCGAACCGTCTTCCTTGCGGTGAGACG 
GCGTATTCAAAAAACGCGGCATAATTTCGGTTTGCGCGATATGGCGCACGACTTTCTGCA 
AACGGTGTAACACTTCCTACTGTCCTCATATTTTGAACTTGCGGCGCGCGAACGTATAAT
GTCCGCTTCCATCACGCCGCTGCGACGGATTATAACCGTCCGAACCGCCAAAAACTATGC 
CCCGATTCCACCTGCCCGAAAACCTTTCCGTCGGACAAACCGTCGCCCTGCCCGACAACA 
TCGTCCGCCACCTCAACGTCCTGCGCGTCCGCCCCAACGAAAACATCACCCTCTTCGACG 
GCAAAGGCAAGGCACACGCCGCACGGCTGACCGTTTTGGAAAAACGCCGCGCCGAAGCCG 
AAATCCTGCACGAAGACACAACCGACAACGAGTCCCCGCTCAACATCACACTGATACAAT
CCATCTCCTCCGGCGATCGCATGGATTTCACCCTGCAAAAAAGCGTCGAACTCGGCGTAA 
CCGCCATACAGCCCGTCATCAGCGAACGCTGCATCGTCCGCCTCGATGGGGAACGCGCCG 
CCAAACGCCTCGCACGCTGGCAGGAAATCGTCATCTCCGCGTGCGAACAAAGCGGCAGGA 
ACACCGTTCCCCCCGTACTGCCCATCATCGGCTACCGTGAAGCACTCGACAAAATGCCGT 
CTGAAAGCACCAAGCTGATTATGAGCATCAACCGCGCCCGCAAACTCGGCGACATACGCC
AACCGTCCGGCGCAATCGTCTTTATGGTCGGGCCCGAAGGCGGCTGGACAGAACAGGAAG 
AACAACAGGCATTTGAAGCTGGCTTTCAGGCGGTTACACTCGGCAAACGGATTTTACGCA 
CAGAAACCGCCCCACTCGCCGCCCTCGCCGCCATGCAGACGCTTTGGGGCGATTTCGCAT 
AAACAGAAATGCCGTCTGAAACCCGTTCAGACGGCATTTTGCAGCCGATTAAGATAGTAG 
GTTCAAATAAGATTTCCCGTGTCGTCATTCCCGCGAAAGCGGGAATCTAGAAACGAAAAA
CTACAGAGATTTATCCGAAACAACAACCCTCTCCGCCGTCATTCCCGCAAAAGCGGGAAT 
CTAGAAACGAAAAACTACAGGGATTTATCCGAAACAACAAACCCTCTCCGCCGTCATTCC 
CGCGCAGGCGGGAATCTAGAAACGAAAAACTACAGGGATTTATCCGAAACAACAAACCCT 
CTCCGCCGTCATTCCCGCGCAGGCGGGAATCTAGAAATTTAACGTTGCGGTGATTTATCG 
GAAATGACTGAAACTCAACGGACTGGATTCCCGCCTGCGCGGGAATGACGAGATTTTAGG
TTTCTGTTTTTGGTTTTCTGTTCTCGCGGGAATAACGGAATTTTAAGTTTTAGGAATTTG 
TCGGAAAAACAGAAATCCCCCCGCCGTCATTCCCGCAAAAGCGGGAATCTAGAAACGAAA 
AACTACAGGGATTTATCCGAAACAACAAACCCTCTCCGCCGTCATTCCCGCGAAAGCGGG 
AATCTAGAAATTTAACGTTGCGGTGATTTATCGGAAATGACTGAAACTCAACGGACTGGA 
TTCCCGCCTGCGCGGGAATGACGAATTTTAGGTTTCTGTTTTTGGTTTTCTGTTCTCGCG
GGAATAACGGAATTTTAAGTTTTAGGAATTTATCGGAAAAACAGAAATCCCCCCGCCGTC 
ATTCCCGCGAAAGCGGGAATCTAGAAATTTAACGTTGCGGTGATTTATCGGAAATGACTG 
AAACTCAACGGACTGGATTCCCGCCTGCGCGGGAATGACGAATTTTAGGTTGCTGTTTTT 
TGGTTTTCTGTTTTTGCGGGAATGACGAATTTTAGGTTTCTGTTTTTGGTTTTCTGTTCT 
CGCGGGAATAACGGAATTTTAAGTTTTAGGAATTTGTCGGAAAAACAGAAATCCCCCCAC
CGTCATTCCCGCAAAAGCGGGAATCTAGAAATTTAACGTTGCGGTGATTTATCGGAAATG 
ACTGAAACTCAACGGACTGGATTCCCGCCTGCGCGGGAATGACGAAGTGGAAGTTACCCG 
AAACTTAAAACAAGCGAAACCGAACGGACTAGATTCCCGCCTGCGCGGGAATGACAGTGT 
ATCCATTTCTAATTTTAATCCGCTATATTTTACACAAACTATTTGAACGATATGACCCGC 
CTGCCGTAAGCTTTCTCAAGCTCCGCCTGCCTTTGACGCTCCATTCTTTTCTTCTTTTCC
CTACCGAATTTACCCAAAGCACGCTTCAAGTCAAACATCACCTTCAACGAACGGCGGTGT 
CTTCTTTCTTGTTCCCTATCTTTTTCCAAATCGCTACCCAACATACTGTTTTTACTGAGG 
AACTTGGCATAATGCAATTCTTGGGTACATAAGGCGGGATTAACCTGATAAACAGGCATC 
CCCTCCTTATCAAAGAAATAAGTAAACATCATCCAATCTACCGCTTTAATCCACTCTGCC 
GGCAAAACGGCAAACCTTTCCAAGAAAAACCGCATCGCCTCACGCGAAATGATATAGCCA
GCCGTCCCCCAATGTTCGCTCTCCAGCAAAGGAAATGACCGATTCTCATAATTCAGGACT 
TTATCCGGTCTGACAATAACTTTCGCAAACATCGTTTCCAAACGAACGATAAAGGCAGAA 
TCCTTATCAAAACGCTCTTCCAACCAAGTATCTTCGGCAAGGAACTTTTCTGCGTCTTTG 
CCAAGCAGGACATCATCCTCAAATACGGCAACATAGGGCAGACCTTCATCCAATGCCTGT 
TTCCACAATACGGCGTGGCTCATAAAGCAGGCTTTTTCCACTTCGCTCAACAGGTGCTGT
TTTGCCAATCCCGGCACCAATTCCGCCATCATCCGATTCAGTTCTTCAGACGGCATCAGT 
GCGTCGAAAAACTGAAACGGGATGCCGCGCACGCCGAAGGTTGCGGCAATGTGCGCCCTG
CGTTCTGCGGCGGAAGCTAAGCTGATAACATGGTTTTGCATAATTTATCCTGTTTTTTGT 
CTGTTGGATAAAGCGGCGTTTTTCAACGGTTTTTCAGCAATCGGCGCAAAATGCCGAAGT 
ATTGCCTCAAGGTAAACAGCCGCCGCATCCTGCCGTCTGCTGCAAATACGATGTCCATCT
CTCCTCCTTTTATTGGAAAGGCACAATGAACTGTTCGCGCCTTTGCCGGCGTTTTTCCCT 
TTCCCTGCTGATTTTGGTCAAGGCGCGGATCAGGCGGTGTTTGAATGTGTTGGCGGGGGA 
ATCGCGCCTTTGCTGTTTGCGGTrCAGGAGGCGGTCGTGTTCGATCAGGCTGCCCAATGC
GCTGTTTTGGTCGTGAAACTTGGCATAATGCAGCTCTTGGGCGCACAAGGCGGGATTGAG 
CTGGCAAACCGGCATTCCTTCCCT3TCGAAAAAATCGCTGAACATCATCAGATCGACGGG 
GTGCAGCCCTTCGGGCGGCAGGGCGGCAAACCTGTCCAGGAAAAACCGCATCGCTTTTCG 
GGAAATGATATAGCCCGCCGTCCCCCAGTGTTCGCTTTCCAACAGCGGAAAGGCGCGCCC
GCAGTAATCCGCCACGCCGGAGGGCGAGGTCAGGACGTGCATAAACATCGTTTCCAAGCG 
GACGATAAAGGCGGTATCCGGGTCAAAGCGTTCTTGCAGCCAAGCGTCTTCGGCAAGGAA 
TTTTTCCGCACCTTCGCCGAGTAAAACGTCGTCCTCAAATACGGTGATATACGGCAGACC 
TTCGTCCAATGCCTGCTTCCACAATACGGCGTGGCTCATAAAGCAGGCTTTTTCCACTCC 
GCTCAAATAGGGGTGCGCCGACAAGCCGGGGACGAGTTCCGCCATTGCCTGTTCCAGCCT
TTCAGACGGCATCAGTGCGTCGAAAAACTGAAACGGGATGCCGTGCCTGCCGAAGGTATC 
GGCAATGTGCGCCCTGCGTTCTGCGGCGGAAGCTAAGCTGATAACGTGGTTTTGCATAAT 
TTATCCTGTTTTTTGTCTGTTGGATAAAGCGGCGTTTTTCAACGGTTTTTCAGCAATCGG 
TGCAAAATGCCGAAGTATTGCCTCAAGGTAAACAGCCGCCGCATCCTGCCGTCTGCCGCA 
AAATCCAGCCACGCGCCGGCGGGCAGCGTGTCCGTCCGTTTGAAGCATTGGTACAAAAAC
CGGCGGGCGCGTTCAAAATCTTCTTCCGGCTyyVTGTTTCTCCAGCAATTCATACGCTACT 
GCTTTTATTTGGCGGTATTCAAGGCTGTCGAACCGGGTTTTAAAACCCATAGACTGCAAA 
AAATCGTTTCTGGCGGTTTTTTGGATGCCTTGCGCGATTTCGTGTTGGCGGATGCTGTAT 
TTGGATGAAACCTGATTGGCGTGAAGGCGGTATTTGACCAAGGCTTCGGGATAATAAGCC
AGCCTGCCCAATTTGCTGACATCGTACCAAAATTGGTAATCTTCCGCCCAATCCCGCTCG
GTGTTGTAACGCAAACCGCCGTCAATGACGCTGCGCCTCATAATCATCGTGTTGTTGTGT 
ATGGGGTTGCCGAAAGGGAAAAAGTCGGCAATGTCTTCGTGTCGGGTCGGTTTTTTCCAA 
ATTTTGCCGTGTTCGTGGTGCCGCGCCAGCCGGTTGCCGTCCTTTTCTTCCGACAAAACT 
TCCAGCCACGCACCCATCGCGATGATGCTGCGGTCTTTTTCCATCTCACCCACGATTTTC 
TCAATCCAGTCGGGGGCGGCAATATCGTCTGCATCGGTGCGCGCAATATATTCCCCCCCC
CCCCCCGACTTTGCCAATTCATCCAGCCCGATGTTTAAAGAGGGAATCAGACCGGAATTG 
CGCGGCTGCGCGAGGATGCGGATGCGGCCGTCCTGTTCTTGGAAACGCTGGGCAATGGCA 
AGCGTACCGTCCGTCGAGCCGTCATCGACAATCAAAATATCCAAGTTGCGCCAAGTTTGA 
TTCACGACGGCGGCTAATGATTGGGCGAAATATTTTTCTACGTTGTAGGCGCAAATCAAT 
ACGCTGACTAAAGGCTGCAATTTATTCTCCCGATAGGCACGATGCCGTCTGAAGGCTTCA
GACGGCATTTGGACTGTACAACGGTTACTCGCCCAAAAGCGCGATATCCGCTACCGCGTT 
CATTTGTTCTGCCAAGCGGTTCAGCAGGTTCAGGCGGTTTTGTTTCACGGCGGCATCTTC 
CGCCATCACCATCACGCCGTCGAAGAAGGCATCGACTTGCGGTTTGACGGAAGCCAGTTC 
GGACAAGGCGGTCTGGAAATTGCCTTCGGCAACGGCG3CGGCAATTTTCGGCTGCAAGCC 
TTGTGCGGCGGCAAAGAGGGCTTTTTCTTCGTCCTGTTGCAGCAAGCTTTCGTTAACCGC
GCCCAACTCGGCATCGGCTTTTTTCAGCAGGTTTTGCACGCGTTTGTTGGCAGCGGCGAG 
CGCGGCGGCTTCGGGCAGTTGTTTGAACGCGGCGACAGCCTGCAGTTTGGCGGTCAAATC 
GTCCAAACGGCGCGGCTGCTTGGCAAGTACGGCGGCAACGATGTCTTGCGGATAATCGTT 
TTGCAGCAATACGGCAAGGCGCGCCTGCATGAAGTCGGCGGTTTCAGACGGCGTTTTTTC 
GTTGAGCAAACCTTGCGGGAAGCTGTTGAAGGCCGTCTGAATCAGTTCGTTTACGTCCAA
ACCGTACTGCATCAGCATACGCAAAATACCCAATGCGGCGCGGCGCAGGGCGTATGGGTC 
TTTGTCGCCGGTCGGAATCAGGCCGATACCCCAAATGCCGACCAAGGTTTCCAGTTTGTC 
GGCAAGCGCAACGGCGGCGGCAATTTTGCCCTCAGGCAGGTTGTCGCCGGCAAAACGCGG 
TTGGTAGTGTTGCTCGACGGCTTCGGTAATTTCTTCGGTTTCGCCGTCCAAGCGGGCGTA 
GTATTTGCCCATCGTGCCTTGCAGTTCGGGGAACTCGCCGACCATTTCGGTTACTAAGTC
GGCTTTTGCCAAACGCGCGGCGCGTTCGGCTGCGGCGGCATCCGCGCCCAAAGCCTTGGC 
GATATGGGCGGCGATGCTTTGCAGGCGTTCGATGCGTTCGGCTTGCGAACCGATTTTGTT 
GTGATAAACCACGTTCGTCAGTTTGGGCAGGCGGCTTTCCAAAGTCGCTTTTTGGTCTTG 
TTTGTAGAAGAACTCGGCATCAGACAGGCGCGCGCGCAAGACACGTTCATTGCCTTGGAT 
GATGTGTGACGGATCTTCGGTTTGCAGATTGGACACCAGCAGGAAGCGGTTCATCAGCTT
GCCGTTTTGGTCGAGCAGCGGGAAGTATTTTTGGTTTTGCTGCATCGTCAGAATCAGGCA 
TTCTTGCGGTACGGCGAGGAAGTGTTCTTCAAAACCGGCTTCCAATACCACAGGCCATTC 
GACCAGCGCGGTTACTTCGTCCAACAAGGCTTCATCGGCGGCGGCGGTCGCGTTCAGACG 
GCGTGCCTGCCCTTCCAATACCGTCTGAATCGCGGCTTTGCGCTCGGCAAACGAAGCGAC 
GACTTTGCCTTGCTCGCGCATTTGTGCGGCGTAGCTGTCGGCGTTTTCAATGGTAATTTC
GCCGTCGGAGAGGAAGCGGTGTCCCAAGGTTTTGTTGCCGCTTTGCAGACCCAAAACGCT 
GACGTTCACAATGTCGCCGCCGTGCAGTACAACTAGCCCGTGAACGGGGCGCACAAAGGT 
AAACGTGCTGCTGCCCCAACGCATTACTTTGGGAATCGGCAGCTTCTTAACCGCTTGATT
GATAATGTCTTCCAAAAGTCCGCCCAACGGTTTGCCGATTTGGACGTATTCGTAGGCGTA 
CACGTCCTGCTTGCCGTCGTGGACGATGGTCAAGTCTTCGATTTTCGCGCCCGCACCGCG 
TGCGAAACCTTCCAAAGCCTTGGTTGGCGCACCGTCTTTCATGGCATTCGCTACGGCAGG 
GCCTTTTTTCACAATTTTTTGATCAGCCTGAACGGCTTTGACGTTTTTGACTTGAACCGC
CAAACGGCGCGGCGAGGCATAAGCCGTAAATTCGGCTGCGCCGTCAACCAGTTGCGCTTT 
TTCCAAGCCTTCGGCAACGGAAGCGGCGAAATGGTTGCCCAGATTATTCAGGGCTTTGGG 
CGGGAGTTCTTCGGTAAGGAGTTCGATTAAAAGGGTTTGGGTCATCATTCGGCTTTCTTT 
GAATTTGGTTAATCTGCCTGTTTATAGGTTTCGCTGTAATTTTCCCAGCCGTCATCCCCA 
TAAAAACCGTCAACCAGCGGGGTGGCGTACAAAGTGGCAACATCTTCGCGGTCTGCCAGC
CAAGAGATAATGGCTTTTTTCTCGGTTTCTCCCAAGCTTCGGGCACCGGATTTTTGAAAC 
AGGCACGAAAAATCGCCGCAATCGCCCCCCCGCCATTTCAAAGCCGTTTGCCGCAAGATA 
CGCAATCAGCTCGTCCATAAAGCGGTCGAACGCTTCGGCATCGTCCTCAGCTTGGTGCAA 
ACTGCCTTGAACGCCGAAAATCAATGTTTGAAACTCGCCCAAATGCAGCTTTTTATGCTG 
GCGGCGGTTCATTTTGTGCAGGCGTTTCCTGCTTGGGGTGCGGAAATAGACAGGCATGAT
TTTCCTAAAAAATATAATGGCTTCCGGACGGCTGCCTTATCGTGCCGCCCGAACGTAAAA 
AATCGTCGCCCCCTTAGGCGGCGTTTGCCTTCATTAAAGGGAAGCCCAGTTTTTCGCGGC 
TTTCAACATATTTTTGCGCCACGGCGCGGCTCAATGCACGAATACGTCCAATATAAGTTG 
CCCGCTCAGTTACGGAAATCGCGCCGCGTGCGTCTAAAAGGTTGAACGTATGCCCCGCTT 
TGAGGACAAGCTCGTAGGCAGGCAGGGCGAGGGCGGCGTTTTCTTCGGCAAGCAGGCGTT
TGGCTTGCGCTTCGTAGTCGTTGAACTGGCGCAGCAGCCAGTCGGCATCGCTGTATTCGA 
AGTTGTAGGTGGATTGCTCGACTTCGTTTTGGTGGTACACGTCGCCGTAGGTGACGGTGT 
TGCCGTCGAGCGTTTTTGCCCAAACGAGGTCGTAGACGTTTTCTACACCTTGCAAGTACA 
TCGCCAAGCGTTCGATGCCGTAGGTGATTTCGCCGAGTACGGGCGTGCAGTCGATGCCGC 
CGACTTGTTGGAAATAGGTAAACTGGGTTACTTCCATGCCGTTGAGCCAGACTTCCCAGC
CCAAACCCCACGCGCCGAGGGTGGGGTTTTCCCAGTCGTCTTCGACAAAGCGGATGTCGT
GGACTTTGGGATCGATGCCCAATTCGCGCAGAGAGTCGAGATAGAGGTCTTGGATATTGG 
CGGGAGCGGGCTTGAGGGCGACTTGGAATTGGTAATAGTGTTGCAGGCGGTTGGGGTTGT 
CGCCGTAGCGGCCGTCTTTGGGGCGGCGGCTGGGTTGGACGTAGGCGGCAAACCAAGGCT 
CGGGGCCGAGTGCGCGCAGGCAGGTGGCGGGATGGGATGTGCCGGCACCGACTTCCATGT
CGAAGGGTTGGATGACGGTGCAGCCTTTGTCTGCCCAGAATGTTTGCAGTTTGAAGATGA 
TTTGTTGGAAGGTAAGCATGGCTTATGATTCGATAAAATAAAGGGTTTATTTTACTGTTT 
CCATTGCTGTTTGGATAGGTTTATCTCAAAGACAGACTGATTTGAAAACACGGCATACAT 
GATATAGTGGATTAAATTTAAACCAGTACAGCGTTGCCTCGCCTTAGCTCAAAGAGAACG 
ATTCTCTAAGGTGCTCAAGCACCAAGTGAATCGGTTCCGTACTATTTGTACTGTCTGCGG
CTTCGTCGCCTTGTCCTGATTTTTGTTAATCCGCTATATGTTTCGGTTAGGCGGCAGGCT 
GCCCTATTGAATACCTTAAAGCAGGCTATGCCTGCCAACGCCATATCCAAACACAGTCTT 
TAATTTAAATCCGGAAAATAAAAAGCACGACCAAACGGTCGTGCTTTTCCAAACCAAACA 
AGTTTATTTCTTGTGCGAACGGATATAGTCCAAAGTTTTGAGCTGTGCAATCGCAGCAGC 
CAATACTTTATGCGCTTCCGCCAAAGCCTTATCGTCTTTAGCTTGGGAAATGCCCGCTTC
TGCGGCTTTTTTCGCCTCTTCCGCACGTGCCCGATCCATCTCCGCACTGCGGACGGCAAC 
ATCCGCCAAGACAGTTACTTTATCAGGCTGTACTTCCAAAACACCGCCGGAAACAGCAAC 
CAAAACCTCTTTATCCTCGCCCGGAACGGTCAAACGCAAAGCCCCCGGCCGCACCAAACT 
CATAATCGGCTCGTGTCGCGGATAAATACCGAGTTCGCCCTGTACAGTCGGAACAACGAT 
AAATGTTGCCTCGCCTGAATAGATTTTCTGCTCGCTACTTACCACCTCAACTTGCATGAT
GCTCATGCCGACCTCCTTAGTTTAAGGTTTTCGCTTTCTCTACTGCTTCTTCAATGCTGC 
CGACCATATAGAATGCCTGCTCGGGCAGATGATCGTATTCGCCGTTCAAGATGGCTTTGA 
AGCCGGCAATGGTATCGCGCAGGGCGACATATTTACCCGGAGAACCTGTAAACACTTCGG 
CAACGTGGAACGGTTGGGACAGGAAGCGTTGGATTTTACGCGCACGCATTACGGTCAGTT 
TGTCTTCATCAGACAATTCGTCCATACCCAAGATGGCGATGATGTCGCGCAATTCTTTGT
ATTTTTGCAGGGTGGACTGCACACCGCGCGCCACGTCGTAGTGCTCTTGACCCAATACCA 
TCGGATCCAGTTGGCGCGAAGTAGAATCAAGCGGATCGACTGCCGGGTAAATACCCAAAG 
AGGCAATATCGCGGCTCAATACGACGGTTGCGTCCAAGTGGGCGAACGTTGTTGCCGGAG 
ACGGGTCGGTCAAGTCGTCCGCAGGTACATATACGGCTTGGATGGAAGTAATAGAACCGG 
TTTGGGTAGAGGTAATACGCTCCTGCAAACGACCCATTTCTTCTGCCAATGTCGGTTGGT
AGCCCACTGCAGACGGCATACGACCCAACAATGCGGATACTTCGGTACCAGCCAGGGTGT 
AACGGTAGATGTTGTCCACGAAGAACAATACGTCGCGGCCTTTGCCGTTTTCGTCTTTTT 
CGTCACGGAAGTATTCCGCCATGGTCAAACCGGTCAATGCGACGCGCAAACGGTTGCCCG
GAGGTTCGTTCATCTGACCGTAAACCATTGCCACTTTATCCAATACGTTGGAATCTTTCA 
TCTCGTGGTAGAAGTCGTTACCTTCGCGGGTACGCTCACCCACGCCTGCGAACACGGACA 
AGCCGCTGTGCGCTTTGGCGATGTTGTTGATCAATTCCATCATGTTCACGGTTTTACCCA 
CACCGGCACCGCCGAACAGACCTACTTTACCGCCTTTGGCAAACGGACACAGCAAGTCAA
TCACTTTAATGCCCGTTTCGAGCAATTCGGTTGTGGAAGACAGTTCGTCAAACTTAGGGG 
CAGCTTGGTGGATGGCACGGCTCTTGTCGGTATCGATCGGACCTGCTTCGTCAACAGGCG 
TTCCCAATACATCGACAATGCGTCCCAACGTACCTTTACCTACCGGCACAGTAATGGGCG 
CACCGGTATTGCTCACAGTCATGCCGCGTTTCAAACCGTCCGAGCTGCCCATCGCAATGG 
CACGGACTACGCCGTCGCCCAAAAGCTGTTGGACTTCCAAAGTCAGACCGTTTTCGTCTA
ATTTCAAAGCGTCGTAAACGCGCGGAATCATGTCGCGTGGAAATTCCACGTCAACAACCG 
CACCGATAATTTGTACGATTTTGCCTTGGCTCATTATCGTATCCTAATTTCCGTACAGGA 
TTCAGACGGCATCAGACAGCCGCCGCACCTGCTACAATTTCTGACAATTCCGTGGTAATC 
GCAGCTTGACGCGATTTGTTATATACCAAACGCAACTCTTTGATGGCATTGCCTGCATTG 
TCTGTTGCAGCTTTCATGGCAACCATGCGGGCTGCCTGTTCGGATGCCATATTGTCGCTC
AACGCCTGATAAACCACAGACTCTAAATAGCGGCGAACCAGATATTCCAACACTGCAAGT 
GCAGTCGGTTCGTAGCGGTATTCCCAGCTGAACGGTGATTTGGGAGCTGAATCGCCAATC 
ACGTTCTCACCGATAGGCAGCAATACTTCCATTCTCGGTTCTTGACGCATGGTATTGACA 
A/U^CCCGAATACACCAGATGGATTCTGTCAATTTCATGTTTCTCATACCGTTGGAAGAGT 

TCTGTCAAAGGTCCGAGCAGCATTTCCATTTTTGGGGTATCGCCCAAATTTACGGCACTG
GCAACCACATTCAGACCAATGCTCTGACACGCCATCAGACCTTTACTGCCAAAGCATACG 
ATTTCCTCTTCAATACCTTGATTCCGATACTCTTGAACTTGTGCCAAAAACTTTTTCAGC 
ACGTTGGCGTTCAAACCGCCACACAAACCCTTATCAGACGTAATCAAAATAAAACCGACA 
CGTCTGATTTCCCGATGAGATTCCAGTAACGGAATACCATGATCGGTATTGGTTTGCGCA 

AGATGGCTCATCACCATACGCACTTTTTCGGCATACGGACGCGCCAAACGCATCCGTTCC
TGAGTCTTCCGCATTTTAGAGGTTGACACCATCTGCATCGCTTTAGTGATCTTTTGGGTA 
TTCTGAACACTGCGGATTTTGGTGAGAATCTCTTTTCCTACTGCCATTTCAGACTCCTTT 
CACTTCAAGCCTTATGCCTGATAGGCGTAAGAAGATTTGAAGGATTTCATGGCTGCTTCA 
AGCGTTTTCTCGCTCTCGTCGGACATTGCACCTGAAGCATTGACGGCTTCCAAAACTTCC 

GGATGTTGGGTACGGACAAAGCTCAAAAATTCAGATTCAAAAGCCAGAGCTTTGGCAACC
GGAACATCAGAATACGAACCGTTGTTGATTGCCCAAAGGGTCAAAGCCATTTCAGCCGTA 
TTCAACGTACTGAACTGTTTCTGTTTCATCAGTTCGGTTACGACTTCGCCATGCTCCAAT 
TGTTTGCGCGTAGCTTCATCCAAATCGGATGCAAATTGCGAGAACGCCGCCAATTCACGA 
TATTGTGCCAACGCCAAACGGATACCGCCACCCAGCTTTTTAATCACTTTGGTTTGTGCA 

GCACCGCCTACGCGGGATACGGAAATACCGGCATTGATTGCAGGACGGATACCGGCGTTG
AAGAGGTCGGTTTCCAAGAAAATCTGACCGTCGGTAATCGAAATGACGTTAGTCGGAACG 
AAAGCAGATACGTCGCCCGCTTGGGTTTCGATAATCGGCAACGCGGTCAGAGAACCGGTT 
TTGCCTTTTACTTCGCCGTTGGTCAATTTCTCCACTTCGTGTTCATTGACACGTGCCGCA 
CGTTCCAACAGACGGGAGTGCAGGTAGAACACATCGCCGGGATAGGCTTCGCGGCCGGGC 

GGACGGCGCAAAAGCAGGGAAATTTGACGGTAAGCCACAGCCTGTTTGGACAAATCGTCA
TAAACAATCAAGGCATCTTCGCCACGATCGCGGAAGAATTCACCCATCGTACAACCGGAG 
TAAGGTGCGATATATTGCAATGCCGCCGCTTCAGATGCAGTTGCAGCAACCACGATGGTA 
TGCTCCATCGCGCCATGCTCTTCCAATTTGCGGACCACGTTGGCAATAGAAGATGCTTTT 
TGACCGATAGCGACATAGATACAGATAACACCCGTACCTTTTTGGTTGACGATGGCATCC 

AATGCTACGGCCGTTTTACCTGTCTGACGGTCGCCAATAATCAACTCACGCTGACCGCGA
CCGACAGGAACCATAGAGTCAATCGCCTTCAGACCGGTTTGCATCGGCTGGTCAACCGAT 
TTGCGCGCAATCACGCCCGGTGCGATTTTTTCGATAGGGGCGGTCAAAGTTGTATTAATC 
GGGCCTTTGCCGTCGATAGGCCGACCCAATGCATCAACGACGCGTCCGACCAGTTCGCGT 
CCGACCGGCACTTCCAAGATACGACCGGTACAGGTAACCGTGTCGCCTTCTTTAATGTGT 

TCGTACTCGCCCAACACTACGGCGCCGACGGAGTCGCGCTCCAGGTTCATCGCCAAGCCG
AAAGTGTTACCCGGGAATTCGAGCATCTCACCTTGCATTGCATCTGACAAACCATGGATG 
CGAACGATACCGTCAGTTACCGAAATTACCGTACCACAGGTACGCACTTCGGCATTTACA 
GACAGATTTTCGATCTTGGCTTTAATCAAATCGCTAATTTCAGCAGGATTAAGCTGCATG 
AAAACTCTCCTAATTCGTCATAGTCGTGTACAAGGCACTCAATTTGCCTTGTACAGACAA 

ATCCAAAACCTGATCACCCACTTCAACTTTTATGCCGCCAATCAGCTCCGGTTCGATTTC
GACAGAGATTTTCAGCTCGCTGTCGAAACGCTTATTCAGCATTTGCACCAACTCGCCGAC 
CTGTTTGTCGGTCAACGGATAGGCACTGTAAATGACGGCAGATTTGATATGGTTGAATGA 
TAAGGTCAAGTCTTGATATTGAGCATATACTTCCGGCAATATCGACAAACGTTTCTGCCC
GGCCAAGACGATAACAAAGTTTTTCAACTCCTTGTCTTTCAAACCGACCAAATCGATGAG 
GATATCTGCTTTTTCTGAAGCATTCGTTTCAGGACGGTCAATCAATGAAGCCACCTTCCC 
TTCCTGAACAACCGCCGCAAGTTTTTCCAGTCCGCCCAACCAAGACTCAATTTGGTTTTT 
TTCCTGAGCCAGACCGAACAATGCCTTTGCATAAGGTCTGGCAATCGTTGCGAACTCTGC
CATAAGATTACAGCTCCTGTTTCAGGGTATCGAGCAGTTTTGCGTGTTTGGAAGCATCGA 
CTTCGCTGCGCAAAATAGATTCGGCACCTTTGACAGCCAACACGGCAACCTGCTCGCGCA 
GGGATTCGCGTGCGCGGAACAATTCCTGCTCCACATCGGCCTTTGCCTGAGCTGCAATGC 
GCGCCGCCTCGGAAGAAGCCTGTTCTTTGGCTTCTTCGACAATTTTGGCGGCACGTTTTT 
CGGCGTTGGCAACCATTTCGGAAACCTGATTACGCCCTTCTGCCAAGAGTTCTGCAACCT
TTTTTTCAGCCTGCTCAAAATCGCTTTTACCACGCTCGGCGGCAGCCAAGCCTTCGGCGA 
CTTTTGCGGCACGCTCATCCAAAGCTTTTGCAATCGGCGGCCACACGAATTTCATGGTAA 
ACCATACCAAACCGAAAAAGACGATGATTTGAGCGAATAATGTTGCATTGATATTCT^CGT 
TACTTAACCTTCGTACTGGGGTTAATCAAACAGGCTGCGCCTGTACGGAACGGACGAATC 
CGTCCTGATTATGCACCTGCAAACGGGTTAACGAAGGCGAACAGCAGTGCAATGGCGACA
CCAATCAAGAATGCGGCATCAATCAAACCGGCAATCAGGAACAGTTTGGTTTGCAGCGGA 
CCGATCAGTTCGGGCTGACGGGCAGAAGACTCCAAATATTTAGAACCGACCATTGCGATA 
CCGATAGAGGCACCCAATGCACCCAATGCAACGATCAAACCACATGCGATAGCAATCAAA 
CCCATTTTAAACTCCTTAAAGAAACAAAGGTTAAACTACAAAAACAAACTACTTAGGAAA 
ATCAGTGCGCATCATGTGCCTGTCCGATATAGACGAACGCCAACGCCATGAAAATAAACG
CCTGCAGGGTAATCACCAAAATATGGAAAATCGCCCATGCCAAACCGGCAATAATGTGGA 
ATACAAACAGAATCGGATCCATGACTTCGACGCTGCCGGAAGCCGCCCAAGCACCGCCAA 
GCAAGGCTATCAACAAGAATACCAATTCGCCCGCATACATATTGCCGAACAACCGCATAC 
CGTGGGATACGGTTTTAGAAAGAAACTCGACCAAATTCAACAGAAAGTTCGCAGGTGCGA 
GTTTTGCACCGAACGGCGCGCTGAACAACTCGTGAAACCAGCCACCCAATCCTTTGATTT
TGATGTTGTAATAGATACAAATCAGCAACACGCCGACAGCGAGTGCCAAAGTGGTGTTCA 
AATCGGCAGTCGGTACGACGCGCAGCAGGGCGTGATGGTTGCCGGTAATGCCCTGCCATA 
CCATCGGCAGCAAATCGACCGGCAGCATATCCATCGCGTTCATCAGAAAAATCCAGACAA 
ACAGCGTCAGACCCAACGGCGCGACGGCTTTTCTAGACTTTTCGTTGTGAATGATGCTCT 
TACACATATCGTCCACAAACTCAAACAAGATTTCCACTGCGGCCTGGAAACGTCCGGGAA
CGCCTGCCGTCGCTTTTTTTGCACCGCGCCACAACAGAAAGCTGCCGATTACGCCCAACA 
GGACGGCAAAAAAGACGGCATCAJiGGTTAATAAACGAAAAATCAGCAATGTTTTTCAGTC 
CCTGACCCTGAGTAACATCCGACAAACTGGTCAAGCTCTGCAAGTGGTGCTTGATGTAGT 
CGGCAGCGGTAATGGTTTCACCTGCCATAATCTTTCACTCTCAACAATACTAAAAAAACC 
AAATGGCTGACACCGAGCAGCCCCATCAGAAACGGGGCGAACACCAGCGATTGATGCCAT
ATTGCAAATACGGCAAGCATGGACAACAGCGACAGCACTACTTTTAAAATCTCTCCGAAG 
ACGAACATCCTGCTTTGCAGGAAGGGGTTTCCCCTGAAAAGTTTTAAAAGTAAAACTGCA 
ACAAACGTGGGAAGCAGGTAGGACAAACCGCCACCGACCGCCGAAAGGAATCCGGCAAAA 
CCCCATACAGCAAAGGCAACTGCGGCGCATATGGACAATACGGCGGATTGTAGGATGATA 
ATCTGCTTCATAAAGGGAATGTTTCCGCCTCGGATTTGGGGCGCGGCTAATATAATTTAG
AAGCCTTATTACGTCAAGCGACAGTTAATCTTTGTGAAACAACGTATCCCAATCCGCCGC 
GCTCGCCGCCTGAATAACGGCGACAGGTGTCATTCTAACACACATTACATATAATTACAG 
GATATTAAGGAGTTTGTCCGCAATTTCTTTACATTTTTAATGTTCTTACGTGATTTGTTT 
GCTTTACGTGGAAATAATAAAAAATCAACGCGAAATTGTAGCAGTTTATCGGTCGGATTG 
TCGGCAGTTTGGGGAATTTGCTCAATAAATAAAAGGTCGTCTGAAAATATTTTCAGACGA
CCTTTTCCGAATAAAGGATTAGCAACTGCCTGCCGCTTTAAGCAAAGCATTGCATTGACT 
TTTGCCTTTGTGCGTTCCGCCTCCCAAACAAATTGCATCGGAAGTGGTAACGCCGATTGT 
GCTGATTACACTGGTAACATAGCATTGGCTCACGCGCTTACCCACAGTTGCGGTAAAGTT 
GATGCGTATGCCTTCATTGTTGCGGTTGCTGATTTTTACGGCATTTGGGCTGACGCCCAA 
GGCAAACGCGGCACGTTCCTGAAGTTTCTAGTCGGAAACGGTTACATTATTGATTGAGCC
GCAACCTGCTAATGCCAACGCAACGAACGCAGCCGAAACGATGATGCGTGTGTTCATAAT 
TTCCTCGAAAATTAAAAATGAAAACAGGAAAACGATTCTTACGTGAAGCAGAAAAAATGT 
CAATAGAATTATATTTCCCACTTAAAATCTGGAAAGCTATTCTCTATATTTCAGACGGTA 
TATCCCGCAAAATTAAGGCCGGTAATCTATGCCCAACTGCTCCAGCAGGTGGCCGAACGT 
TTCAGGCGTATCGAAATACAGGACAATCCTGCCTTTTTTGTGGTTGGCGGTTTTGACTTC
AGCGTTGACACCCAGTTTTTCAGTCAGCAAATCATTCAGGCGGCCGATGTCGGCGGCGGC 
AGTCTTTTTGGGCTCGGGACGTTTGTTTTGAAGGGCGGCCTGGCTGCGGCGTTCGACTTC 
GCGCACCGACCAGCCGTTTTTGACGGCCTTTTGCGCCAATTCGAGCTGTTCGACGAGTTT
TCGGAGAGGACCATGCGCGGGCGTGCCCCATTTC 



The following partial DNA sequence was identified in N. meningitidis <SEQ ID 5>:
gnm_5

CAGACATTACCGTGTACAACGGCCAACACAAACGAAGCAGCACAAGCCGTTGCAGATGCC 
TTTACCCGGGCTACCGGCATCAAAGTCACACTCAACAGCGCCACACGCGACCAGCTTGCC 
GGTCAAATCAAAGAAGAAGGCAGCCGAAGCCCCGCCGACGTATTCTATTCCGAACAAATC 
CCGGCACTCGCCACCCTTTCCGCCGCCAACCTCCTAGAGCCCCTGCCCGCCTCCACCATC
AACGAAACACGCGGCAAGGGCGTGCCGGTTGCCGCCAAAAAAGACTGGGTGGCACTGAGC 
GGACGTTCGCGCGTCGTCGTTTACGACACCCGCAAACTGTCTGAAAAAGATTTGGAAAAA 
TCCGTCCTGAATTACGCCACGCCGAAATGGAAAAACCGCATCGGTTACGCCCCCACTTCC 
GGCGCGTTCTTGGAACAGGTTGTCGCCATCGTCAAACTGAAAGGCGAAGCGGCCGCATTG 

AAATGGCTCAAAGGTCTGAAAGAATACGGCAAGCCTTACGCTAAAAACTCCGTCGCCCTT
CAAGCGGTTGAAAACGGCGAAATCGATGCCGCCCTCATCAACAACTACTACTGGCACGCT 
TTTGCGCGTGAAAAAGGCGTACAAAATGTCCACACCCGCCTGAATTTCGTCCGCCACAGA 
GATCCCGGCGCACTCGTTACCTATTCCGGCGCAGCCGTGTTAAAATCCTCCCAAAACAAG 
GATGAGGCGAAAAAATTCGTCGCCTTCCTCGCCAGCAAGGAAGGACAGCGCGCCCTGACC 

GCCGTCCGTGCCGAATATCCTTTGAATCCGCACGTGGTATCCACTTTCAATTTGGAACCC
ATCGCCAAGTTGGAAGCACCCCAAGTGTCCGCCACCACTGTTTCCGAAAAArAACACGCC 
ACCCGGCTGCTTGAGCAAGCCGGTATGAAATAAGCCGTTTTCGGATTGTCAAACGGGTGG 
ACATTTATACTGTCCGCCCGTTTTGCCGATAAAAAACACTATGTCTCCTAAAAAAATACC 
CATTTGGCTTACCGGCCTCATCCTACTGATTGCCCTACCGCTTACCCTGCCTTTTTTATA 

TGTCGCTATGCGTTCGTGGCAGGTCGGCATCAACCGCGCCGTCGAACTGTTGTTCCGCCC
GCGTATGTGGGATTTGCTCTCCAACACCTTGACGATGATGGCGGGCGTTACCCTGATTTC 
CATTGTTTTGGGCATTGCCTGCGCCCTTTTGTTCCAACGTTACCGCTTCTTCGGCAAAAC 
CTTTTTTCAGACGGCAATCACCCTGCCTTTGTGCATCCCCGCATTTGTCAGCTGTTTCAC 
CTGGATCAGCCTGACCTTCCGTGTCGAAGGCTTTTGGGGGACAGTGATGATTATGAGCCT 

GTCCTCGTTCCCGCTCGCCTACCTGCCCGTCGAGGCGGCACTCAAACGCATCAGCCTGTC
TTACGAAGAAGTCAGCCTGTCCTTGGGCAAAAGCCGCCTGCAAACCTTTTTTTCCGCCAT 
CCTCCCCCAGCTCAAACCCGCCATCGGCAGCAGCGTGTTACTGATTGCCCTGCATATGCT 
GGTCGAATTTGGCGCGGTATCCATTTTGAACTACCCCACTTTTACCACCGCCATTTTCCA 
AGAATACGAAATGTCCTACAACAACAATACCGCCGCCCTGCTTTCCGCTGTTTTAATGGC 

GGTGTGCGGCATCGTCGTATTTGGAGAAAGCATATTTCGCGGCAAAGCCAAGATTTACCA
CAGCGGCAAAGGCGTTGCCCGTCCTTATCCCGTCAAAACCCTCATIACTGCCCGGTCAGAT 
TGGCGCGATTGTTTTTTTAAGCAGCTTGTTGACTTTGGGCATTATTATCCCCTTTGGCGT 
ATTGATACATTGGATGATGGTCGGCACTTCCGGCACATTCGCGCTCGTATCCGTATTTGA 
TGCCTTTATCCGTTCCTTAAGCGTATCGGCTTTAGGTGCGATTTTGACTATATTATGTGC 

CTTGCCCCTTGTTTGGGCATCGGTTCGCTATCGCAATTTTTTAACCGTTTGGATAGACAG
GCTGCCGTTTTTACTGCACGCCGTCCCCGGTTTGGTTATCGCCCTATCCTTGGTTTATTT 
CAGCATCAACTACACCCCTGCCGTTTACCAAACCTTTATCGTCGTCATCCTTGCCTATTT 
CATGCTTTACCTGCCGATGGCGCAAACCACCCTGAGGACTTCCTTGGAACAACTCCCAAA 
AGGGATGGAACAGGTCGGCGCAACATTGGGGCGCGGACACTTCTTTATTTTCAGGACGTT 

GGTACTGCCGTCCATCCTGCCCGGCATTACCGCCGCATTCGCACTCGTCTTCCTCAAACT
GATGAAAGAGCTGACCGCCACCCTGCTGCTGACCACCGACGATGTCCACACACTCTCCAC 
CGCCGTTTGGGAATACACATCGGACGCACAATACGCCGCCGCCACCCCTTACGCGCTGAT 
GCTGGTATTATTTTCCGGCATCCCCGTATTCCTGCTGAAGAAATACGCCTTCAAATAACA 
GCTTGAGGAAGTACCGCCATGACCGCCGCCCTGCACATCGGACACCTGTCCAAAAGTTTT 

CAAAACACCCCAGTTTTAAACGACATTTCGCTCAGCCTCGACCCGGGCGAAATCCTCTTT
ATCGTCGGCGCGTCCGGCTGCGGCAAAACCACCCTTTTACGCTGCCTTGCCGGTTTTGAA 
CAACCCGATTTTGGCGAAATTTCGCTTTCCGGCAGAACCATCTTCTCGAAAAATACCAAC 
CTCCCCGTCCGCGAACGCCGTTTGGGTTATGTCGTACAGGAAGGTGTGCTGTTCCCCCAC 
CTGACCGTTTACCGCAACACCGCCTACGGGCTGGGCAACGGCAAAGGCAAGACGGCGCAA
GAGCGGCAGCGCATCGAAGCTATGTTGGAATTGACCGGCATTTCCGAACTTGCCGGACGC 
TATCCGCACGAACTTTCGGGCGGACAGCAACAGCGCGTCGCCCTCGCCCGCGCCCTCGCA 
CCCGATCCCGAACTGATTTTGTTGGACGAACCCTTCAGCGCGCTGGACGAACAGTTGCGC 
CGCCAGATTCGCGAAGACATGATTGCCGCCCTGCGCGCCAACGGCAAATCTGCCGTTTTC
GTCAGCCACGACCGCGAAGAAGCCCTGCAATACGCCGACCGGATTGCCGTGATGAAACAG 
GGGCGCATCCTCCAAACTGCAAGCCCTCACGAATTGTACCGACAACCTGCCGACCTTGAT 
GCCGCCCTGTTTATCGGCGAAGGCATCGTGTTCCCCGCCGCGCTCAACGCCGACGGCACC 
GCCGATTGCGGATTGGGCCGCCTGCCCGTTCAAAGCGGCGCACCCGCAGGCACACGCGGC 

ACACTGCTCATCCGTCCGGAACAGTTCAGCCTTCACCCCCATTCCGCACCCACCGCCTCC
ATTCACGCCGTGGTTCTCAAAACCACGCCCAAAGCGCGGCATACCGAAATCAGCCTCCGG 
GTCGGACAAACCGTCCTCACGCTCAACCTCCCTTCCGCCCCCACCCTGTCAGACGGCATT 
TCCGCCGTCCTCCATTTGGACGGTCCCGCCCTGTTCTTCCCCGGAAATACCCTCTGAAAG 
GCGGCAGCATCCACAAGCCTGCGGATATTTATCTTGTTGGAAACAGAATTTGTTTGCTAT 

ATTCAACCTGCGCGCCTCAAGCCAAACACGGCACGCACGGCACGCAGGCAGCCGTTTCTG
CCTATATGCCGCCCCTTCCAACCACATATGCCGCACACCGCAGCATACGAAAGGATATAT 
CATGGCAAAAGTACTCATCGTACCCGTATCTGCCGGACTGGACGCCTCCGCCGCCGCACA 
AGCCTTTGCAAAAGCACTGGACGCACAAATTTTCCAAGCCGTTGACGCAACCGCCGAAAC 
CCTGCTCGCGCAAGGCAAAAGCGACGACTGGTTCGACGCACTGGTCGGCAAAGTTGCCGC 

ACTCGATGCCGCCAACCTCGTCATCGAAGGCATCGCGCCCGATGCCGACAAAATCTACCT
CGCAGGCAAAAACGTCGAACTGGCATTGTCCCTTGACGCGGCAGCCGTCTTCGCCGTCCG 
TTCCGACAACGCCGATGCCGACGAACTGGCAAATCGGGTGAACCTTGCCAAACAGTTCTT 
CGCCGCCGCGCCGGGCGTATTGGAAGGTTTTGTCGTGGACGGCGCGGCAGCCTCCGTTGC 
CGAAGCGGCAGCCGAAAAAACCGGCCTGACCTTCTTCGGTTCGAGCGACGCGCTGAAAGA 

CGTATCCGTATTGGCAGGCCGCGAAGCAAAACGCCTGTCGCCGGCGCAATTCCGCTACAA
CCTGATCGACTTCGCCCGCCAAGCCGACAAACGCATCGTCCTGCCTGAGGGCGCAGAACC 
CCGCACCGTCCAAGCCGCCGCCATCTGCCACGAAAAAGGCATTGCCCGCTGCGTCCTGCT 
TGCCAAACGCGAAGAAGTCGAAGCCGTTGCCAAAGAACGCGGCATCAGCCTGCCCGACTC 
TTTGGAAATCATCGATCCCGCCTCATTGGTCGAACAATACGTCGAGCCGATGTGCGAACT 

GCGCAAATCCAAAGGCCTGACACCCGAAGACGCGCGCAAGCAACTGCAAGACACCGTGGT
ACTCGGTACGATGATGATGGCGCAAAATGATGTGGACGGTTTGGTATCCGGTGCGGTTCA 
CACCACCGCCAACACCATCCGCCCCGCTTTGCAACTGATTAAAACCGCACCGGGCGCAAG 
CCTCGTGTCCAGCGTATTCTTTATGCTGCTGCCCAACCAAGTCCTCGTCTTCGGCGACTG 
CGCGGTTAATCCGAACCCGACCGCGCAACAGCTTGCCGACATCGCCATCCAGTCTGCCGA 

TTCCGCAAAAGCCTTCGGCATCGACCCGAAAGTGGCGATGATTTCCTACTCCACCGTCAA
CTCCGGCAGCGGCCCCGATGTCGATACCGTCATCGAAGCAACCAAACTTGCCCGGGAAAA 
ACGCCCCGACCTCGCCATCGACGGCCCGCTGCAATATGATGCGGCAACCGTGCCGGGTGT 
GGGCAAATCCAAAGCTCCGGGCAGCCCGGTGGCAGGACAGGCAACCGTTTTGGTCTTCCC 
CGACCTGAACACCGGCAACTGCACCTATAAAGCCGTCCAACGCAACGCCAACGTCTTAAG 

CGTCGGCCCGCTGCTGCAAGGCCTGCGTAAACCGGTCAACGACCTCTCCCGCGGCGCACT
GGTAGAAGATATCGTGTTTACCATCGCCCTGACTGCCGTTCAGGCAAAACAAATGGAAGG 
CTGACAAACGGCTTTCCGGGTTTAAACCCTATGCCGTCTGAAGACAGACTCCCGTTTTCA 
GACGGCATTTTTATCAGCACGGCACATTTGTTTGTTAAAATCGCAGCCATATTGCAAAAA 
AAGAGGAGGAAGCCATGCAAACCGCCATTATCGATTACGGTATGGGCAACCTGCATTCCG 

TATTGAAATCCGTCCGGACGGCGGGGCAGCTTGCCGGAAAAAATACCGAAATCTTTTTAA
GCGGCGACCCCGACCGCGTGTCCCGCGCCGACAAAGTCATTTTTCCCGGTCAGGGCGCGA 
TGCCCGACTGTATGGCGGCATTAAAACGAGACGGTTTGGACGAGGCAGTCAAAGATGCCT 
TAAAAAACAAACCGTTTTTCGGAATCTGCGTCGGCGCGCAACTTTTATTCGACCACAGTG 
AAGAAGGAAACACCGACGGCTTGGGCTGGTTCGGCGGCAAAGTCAGACGCTTTGAGCGCG 

ACCTCCGCGACCCGCAGGGATGCCGTCTGAAAGTCCCGCATATGGGCTGGAACACCGTGC
GCCAAACCCAAAACCACCCGCTGTTTAAAGATATTCCCCAAGACACGCGTTTTTACTTCG 
TCCACAGCTACTATTTCGCCCCCGAAAATCCCGAAACCATATTGGGCGAAAGCGACTACC 
CGTCCCCGTTTGCCTGCATCGTCGGCAAAGACAACGTATTCGCCACGCAATTTCACACCG 
AAAAAAGCCACGATGCCGGGCTGACGATGTTGAAAAACTTTTTAAACTGGTAAGCCGGAC 

ACGGCCCCGCACAAGGAGAAAAATTATGCTGCTGATACCCGCCATCGATTTGAAAGAAGG
ACGCTGCGTCCGCCTGAAACAAGGGCTGATGGAAGAGGCGACCGTCTTTTCCGATTCGCC 
CGCCGAAACCGCGCTGCACTGGTTCAAACAAGGCGCGCGCCGCCTGCATCTGGTAGATTT 
GAACGGCGCGTTTGCCGGCGTTCCGCAAAACCTGCCCGCCATCAAAGACATCCTTGCCGC
TGTCGCCAAAGACATCCCCGTACAGCTCGGCGGCGGCATACGCGATTTGAAAACCATCGG 
ACAATATTTGGATTTGGGCTTAAACGACGTGATTATCGGCACGGCGGCGGTCAAAAACCC 
CGACTTCGTGCGCGAGGCGTGCAAAGCCTTCCCCGGCAGGATTATTGTCGGGCTGGATGC 
CAAAGACGGTATGGCCGCCATCGACGGCTGGGCAACCGTAACCGGGCATCATGTAATTGA
TTTGGCAAAACGCTTTGAAGACGACGGCGTCAACAGCATCATCTACACCGACATCGGGCG 
CGACGGTATGATGAGCGGCGTGAACATCGACGCGACGGTCAAACTCGCCCAAACCGTCCG 
CATTCCCGTCATCTCCTCCGGCGGACTGACCGGCTTGGACGACATCCGCGCCCTGTGTGC 
CGCCGAATWVCATGGCGTAGCAGGCGCGATTACCGGCCGCGCGATTTACGAGGGTAGCAT 
CGATTTTGCCCAAGCGCAGCAACTGGCAGATTCCCTCGACTAAAGGCATCCGATTATGGC
ACTGGCAAAACGCATCATCCCCTGTCTCGACGTAAAAGACGGGCGCGTCGTCAAAGGCGT 
GAACTTCATCGGTTTGCGCGACGCGGGCGACCCCGTCGAAGCCGCCAAACGCTACAACGG 
CGAAGGCGCGGACGAATTGACCTTCCTCGACATCACCGCCTCATCCGACAACCGCGACAC 
CATCCTGCACATCATCGAAGAGGTTGCCGGACAAGTCTTCATCCCCCTGACCGTCGGCGG 
CGGCGTACGCACCGTTGCCGACATCCGCCGCCTGCTCAATGCCGGCGCGGACAAAGTCAG
CATCAACACCGCCGCCGTTACCCGTCCCGATTTAATTGACGAAGCCGCCGGATTTTTCGG 
TTCGCAAGCCATCGTCGCCGCCGTCGATGCCAAAGCCGCCAACCCCGAAAACACACGCTG 
GGAAATCTTTACCCACGGCGGGCGAAATCCGACCGGTTTGGATGCGGTGGAATGGGCGGT 
CGAAATGCAAAAACGCGGCGCGGGCGAAATCCTGCTCACCGGTATGGACAGGGACGGTAC 

GATyVCAGGGTTTCAACCTGCCGCTGACCCGCGCCGTTGCCGAAGCCGTCGACATCCCCGT
CATCGCCTCCGGCGGGGTCGGCAATGTCCGGCACCTGATTGAAGGCATAACCGAAGGCAA 
AGCCGATGCCGTACTTGCCGCCGGCATTTTCCATTTCGGGGAAATCGCCATCCGCGAAGC 
CAAACGCGCTATGCGCGAAGCCGGCATCGAAGTGCGCCTCTGACCGCCTCGACTATGCCG 
TCTGAAAGGAAATATGGATAAAAACCTGCTTGAAGCCGTCAAATTTGACGAAAAAGGTTT 

GGTTTGCGCCATCGCCCAAGATGCCGAAACCAAACGTATTTTAATGGTGGCGTGGATGAA
CGCCGAAGCCCTGCAAAAAACCGTCGAAACCGGCTTTGCCCACTATTACAGCCGTTCGCG 
CCAAAAACAATGGATGAAGGGCGAAGAGTCGGGACACACGCAAAAAGTCCGCGCACTGCG 
CCTCGACTGCGACGGCGACGCCATTGTGATGCTCATCGCCCAAAACGGCGGCATCGCCTG 
CCACACCGGGCGAGAAAGCTGCTTTTACAAAGTCTGGCGTGGCAGCGCGTGGGAAACCGC 

CGATGCCGTCCTGAAAGACGAAAAAGAGATTTACGGCAGCACGCACTGACCGCCTCCAAC
ATTGAATTATCAGGCATTTTTTTGTACAATTTCGCCGTCTCAAACACTGTCCGGGCCGTC 
TGAAAAGCGGCCTGAACCTTTTTGCAAAGAAAACCATGTCCCAAGAAATCCTCGACCAAG 
TGCGCCGCCGCCGCACGTTTGCCATCATCTCCCACCCTGACGCAGGTAAAACCACGTTGA 
CTGAAAAACTCTTGCTGTTTTCGGGCGCGATTCAGAGCGCGGGTACGGTAAAAGGCAAGA 

AAACCGGCAAATTCGCCACTTCCGACTGGATGGAAATCGAGAAGCAGCGCGGCATTTCCG
TGGCATCAAGTGTGATGCAGTTCGATTACAAAGACCACACCGTCAACCTCTTGGACACGC 
CGGGACACCAAGACTTCTCCGAAGACACCTACCGCGTTTTAACCGCCGTGGACAGCGCAT 
TAATGGTCATCGACGCGGCAAAAGGCGTGGAAGCGCAAACCATCAAGCTCTTAAACGTCT 
GCCGCCTGCGCGATACACCGATTGTTACGTTTATGAACAAATACGACCGCGAAGTGCGCG 

ATTCCCTGGAACTTTTGGACGAAGTGGAAAACATTTTAAAAATCCGCTGCGCGCCCGTTA
CCTGGCCGATCGGTATGGGCAAAAACTTCAAGGGCGTGTACCACATCCTGAACGATGAAA 
TTTATCTCTTTGAAGCTGGCGGCGAACGCCTGCCGCACGAGTTCGACATCATCAAAGGCA 
TCGATAATCCTGAATTGGAACAACGCTTTCCGTTGG?AATCCAGCAGTTGCGCGACGAAA 
TCGAATTGGTGCAGGCGGCTTCCAACGAGTTTAATCTCGACGAATTCCTCGCCGGCGAAC 

TCACGCCCGTATTCTTCGGCTCTGCGATTAACAACTTCGGTATTCAGGAAATCCTCAATT
CATTGATTGACTGGGCGCCCGCGCCGAAACCGCGCGACGCGACCGTACGTATGGTCGAGC 
CGGACGAGCCGAAGTTTTCCGGATTTATCTTCAAAATCCAAGCCAATATGGACCCGAAAC 
ACCGCGACCGTATTGCCTTCTTGCGCGTCTGCTCCGGCAAATTCGAGCGCGGCATGAAGA 
TGAAACACCTGCGTATCAACCGCGAAATCGCCGCCTCCAGCGTGGTTACCTTCATGTCGC 

ACGACCGCGAGCTGGTTGAAGAAGCCTACCCCGGCGACATTATCGGCATCCCGAACCACG
GCAACATCCAAATCGGCGACAGCTTCTCCGAAGGCGAACAACTGGCGTTCACCGGCATCC
CATTCTTCGCACCCGAACTGTTCCGCAGCGTACGCATCAAAAACCCGCTGAAAATCAAAC 
AACTGCAAAAAGGCTTGCAACAGCTCGGCGAAGAAGGCGCGGTGCAGGTGTTCAAACCGA 
TGAGCGGCGCGGATTTGATTTTGGGCGCGGTCGGCGTGTTGCAGTTTGAAGTCGTTACCT 

CGCGCCTCGCCAACGAATACGGCGTAGAAGCCGTGTTCGACAGCGCATCCATCTGGTCGG
CGCGCTGGGTATCGTGCGACGACAAGAAAAAACTGGCTGAATTTGAAAAAGCCAACGCGG 
GCAACCTCGCCATCGACGCAGGCGGCAACCTCGCCTACCTCGCCCCCAACCGCGTGAATT 
TGGGACTCACGCAAGAACGTTGGCCGGACATCGTGTTCCACGAAACACGCGAACATTCGG
TCAAACTGTAAAAAGCAATCGGCGATAAAATGCCGTCTGAACCCGAAAAAAGGCTTTCAG 
ACGGCATTTTTGCCTGCAACTCAAATGCACCGATCAAAATCAAGACGCATCGGATACCGT 
TATCGGGCATCCCGTCCTCATGAATTTAGGGCTGACGCAAGAACGCTGGCCGGACATCGT 
GTTCCACGAAACGCGCGAACATTCGGTCAAACTTTAAAAAACAATCGGCAATAAAATGCC
GTCTGAACCCGAAAAAAGGCTTTCAGACGGCATTTTTGCCTGCAATTCAAACGCAGACGG 
TCAAAATCAAGGCGCATCGGATACCGTTATCGGATGCGTCCCATCCGCATGAATTTGGGG 
CTGACGCAAGAACGCCGATGTGATTTCACATCCCGTACTGTTTCGACAGCTTCACATAAT 
GCGCGGCGGAATATTTCAAAAAGGCTTTTTCCTCATCGGTCAGCACGCGCACCTGTCTGA 
CCGGCGAACCGACATAAAGATAGCCGCCCGCCAAGCGTTTGCGCGGCGGAACGAGGCTGC
CCGCGCCGATCATCACTTCGTCCTCAATCACGGCATCGTCCAGAACCGTCGTCCCCATGC 
CGACCAGGACGCGGTTGCCGATACGGCAGCCGTGCAGCATCACTTTGTGCCCCACGGTAA 
CGTCTTCGCCGATAACCAGCGGCGATCCTTCGGGTTTGGCGGCGGTTTTGTGGGAAACGT 
GCAAGACGCTGCCGTCCTGTATATTGCTGCGCGCGCCGACGGTGATGCTGTTCACATCGC 
CGCGCAACACGGCGCACGGCCACACGGAAACATCTTCGGCAAGCGACACTTCGCCAATGA
CGACGCAGGCTTCGTCTATCATACAGGTTTCGTGGATTTCGGGCGTGCGGTTTTGGAAAG 
TTCGGATTGCGTTCATTTTTCCTCCTTCGGTAAGGTATATATTGTTAAAGGATTTATTAA 
ATATTCCCCCTGATTGCTTTTAAAATCCTGCCTGTAATATCGACCCCGAGTAATGTGATT 
ATCGGGAATATCAGCTTATATATCAATTTATTGGACTTTAACAGCATAAACCTTAAATGA 
TACGCCCTTCTTTTTATATCAGCATCACACTCTATATTTTTACTCGTCATTATAAAAAGC
AAAACGAGATATTCGTAGGAAAGAAAAGAATAAAGATAACTCGATATATCCCTATTAAAT 
TCCATTTCCGCATTTTTCTCCAAAATATATAATAATGACTTTATACTTTTTTCCGAAACA 
GTTCCCGTAATAGAATCTTTTCTTCCCTGCCGATAATAGTAATAACAACCGTCCAAATAA 
GAAAAAGTTGTTGCCGCATTAAATAACCTCATTGACCATTCGATATCTTCAGAATAAATT 
CCCCTTTCAAAAAACAGTTTTTCTCTAATAATCAATTCTCTTTTTATAATCTTATTCCAC
GCCGAACCCGGAAATTTTCTAAATCGGCATAATCCTTTCAAAACTTCGACTTTGGATTGA 
TTGAGTATTTTTTCAGGCTGATAATCTTCGCCAAAATATGAAACACTTCCCTTATCATAT 
TTAACCGCATTTAAAAACACCACATCCGGCATATCAGTATCATCTTTACTAAGAAAATCC 
AGCAAAATCTGACAGTTAATAAAATCATCCGAATCAATAAAGACTATATATTTTCCGTTT 
GAATTTTTTATTCCGGTATTTCTCGCCTCCGACAACCCTTGGTTATCCTGATATATATAT
TTGATATTTGGTGTCCGGTGTTTGTTGTTTGTTGTTTGTTGTTTGTTGTTTGTTGTTTGT 
TGTTTGTTGTTTGTTGTTTGTTGTTTGTTGTTTGTTGTTTGTTGTTTGTTGTTTGTTGTT 
TGTTGTTTGTTGTTTGTTGTTTGTTGTTTGTTGTTTGTTGTTTGTTGTTTGTTGTTTGTT 
GTTTGTTGTTTGTTGTTTGTTGTTTGTTGTTTGTTGTTTGTTGTTTGTTGTTTGTTGTTT 
GTTGTTTGTTGTTTGTTGTTTGCTGTTTGATATTTTATCTATATATTTGTAACATATATC
TTCACTTCCGTCTTTTGACCCGTCATTCACAAGGATAAGTTCGACATTTTCATTACTTAA 
TATAGATTCTATGGAACTTAAACACGCTTCCAAATAACTTTCGACATTATAAATAGGAAC 
TACGATACTTAAATCCATATCTTTCTCATTTTACTAAATCATTTTAATCTTAACCCAATC 
ATAACCGGCAGGAGGGGAAACGCCCCCCTGTTTGATAACGGACGCGCCGTTTCCTGCCGC 
CCGAAAGGTTTCAGACGGCAGGGATTCCGGTTATTTGCCCGCTTTGAGCCCTTGCCACAG
CTTCACGCCCAGTTTGACGGATTCTGCGGATTTGGGCGATACGATGAAACTTTTTTCCAT 
CAGTTCTTTGTTCGGGAAAATCGATGCGTCGGAGGTGTATTTTTCATCCATCAGCTCGCG 
CGCGGGACGGCTGGCGGGCGCGTAGGTAACGAAGCTGCCGTTTTTCGCCGCCACCTCGGG 
CCGGAGCGTGTAGTCGATATAGCGGTGGGCATTGGCAACGTTTTGCGCGTCGCGCGGAAT 
CATAAAGGAATCCACCCACACGCCCACGCCGGTTTTCGGGGTCAATACTTTGATTTCCAC
GCCGTTTGCGGCTTCTTCGGCACGGGTTTTGGCAATGTTCAAATCGCCGCCGTAACCGAT 
GGCGGCACACAGGTTGCCCGCCGCCATATCGTCGATATAGCCGGAAGAGCTGAAGCGTTT 
CACGTCGCCCCGGACGGCTTTCATCATATCGACGGCGGCTTTGATGTCTTCGGGATTCTC 
ACTGTTGGGGTCTTTGCCCAAATAGTGCAACGCCAAGGGAATCTGTTCGATTGCGCTGTC 
GAAATAGCTGATGCCGCAGGATTTGAGTTTGGCGGTGTATTCGGGTTTGAACACCAAATC
CCATTCGTTTTCGGGCAGCTTGTCCGTACCCAATGCTTTTTTCACCTGCTGGGTATTGAT 
TGCCAAGGTATTGATGCCCCAGAAATAGGGGACGGCGTATTCGTTGCCCGGATCGACGGC 
TTCCATCATTTTCAGCAAATCTTTATCGATGTTGCCGTAATGGGGGATTTGCGCCTTGTC 
GATTTTCTGATACGCGCCCGCTTTGATTTGCCGGCCGACGTTGGCGATGGACGGCGCGGT 
CAGGTCGTAGCCGGATTTGCCGGTCAGGACTTTTGCCTCCAGTGTTTCGTTGCTGTCGTA
ATAATCGGAACGCGTCTTGATGCCGGTTTCTTTTTCAAAGGCGGCAACGGTTTCGGGATC 
GACATAATCCGACCAGTTGTAGATGTTGAGTTTGCCCGATTGTTCGGCTTCGGGCTTGGC 
GGAGGGGGTTTGGGCGGCGGTATCGCTTCCGCCGCCGCACGCAGTCAAGGCGAGGCTCAG
GATTGCCGCCGCCACCAGTGTTTTTTTCATGGTTTTCGACCTTTCGTTCAAAAATAAAAA 
ACATTGCAGGGACGGACGGGCCGCCTGCATCGGAAGCTGTGGGGGCGGCCGGTCGGCTTT 
CTTGGCCCTTGGTGCAGCAAGGTTTGCATTAAACGGCAAAAACAGCCGGGGCGCAACCGT 
TAATTTTCACCGGTTTGCCGTTCCGGTGCCCGATCGGGCGGTAATGCGGGACATTGTTCC
GCCTGCGTAAAAATGCCGTCTGAAGCCTCGGCGGCTTTCAGACGGCATCGGGGCGGACGG 
CATCAGTTGCTCAACACGTCCACGCCTTTGGGCGGGGTAAACTTGAACGCGCCGCGCGAG 
AGTTGGGGATTGGTATTCAAACCGCCGAAACTGATGGAGGTTTGGTTGCCGAAGCTGTCT 
TTAAGCTGCATGGCGGCGAGGTTGCCGCCTTTGAAGCCGATGCGGATGTATTGGTAGCCG 
GCGTTGTTGCGTTTGGGCGTTGCCAGCACATAATCGATGCCGTTGGACGAACCGTCCTCT
TTCAGCGTGTAGCTGCTTTCGAGGGCGGTTTTGTTCGACAGGATGGCGGCGGGGCTGCCG 
CCTATGGCCTGGTCTTGGGACGACTTGGTCACTTGTGCCAGATCAACATCGTAGAGCCAA 
ACGGTTTGACCGTCGCCGACGATGGTTTGCCTGTAAGGTTTGGTGTATTCCCATTTGAAA 
AGGCCCGGTCGCAGGATTTTGAACGTGCCGTGCGCGGTTTGGGTTTTCTTTTTGCTTTGG 
ACGGTTTGGGTGAAGCTGCCGCTGATACCGTCGGCATCGTTGTTGAATTGCTTAAGCGCG
TCTACCGCGCCCGCCTGTGCGGAAGCGACGGCGACGGTCAGGGAGCAAACGGCGAGGAAT 
TGGAACAGGTTGTGCGGTTTCATCATTATTTTTCCTTGTCGGGATGAGTGTGGCGTAAAG 
TATCGGCGCAGCAAAACAATCATACGGGCGGCGTTACGGGCGGTTTGCATTTTGCAAACC 
GCGTTTTCCGAGGGCTGATTTTTTGCCCACCGGAAAAGGCGGCGCGCCACGCTGCCCTTT 
AATGTGTGCCGCGTTATAGTGGATTAACAAAAATCAGGACAAGGCGACGAAGCCGCAGAC
AGTACAAATAGTACGGAGCCGATTCACTTGGTGCTTCAGCACCTTAGAGAATCGTTCTCT 
TTGAGCTAAGGCGAGGCAACACCGTACTGGTTTTTGTTAATCCACTATATTTGACGGTTT 
GGGCGCGGAATGCCGAATCGCGGAGATTATATGCCTCATCTTGTTTTTTTTAAACGCTAT 
AATATTTGTTTTCCGAACACGGCGGACTTGAGATGAAACCCGATATTTATGCTTTGCTGG 
AACGCGCCCTGCTTTCGGGCGACCCAGATGAAAAAGGACGGCTGACGGATGAGGCGTTTG
CCGCCGTTCAAAATGCGGACGGGGCGGAAACAAACGCACCGCCGGCGGACTTCCCCCGCG 
CGGGACGACCGGACAAGCCTGTTTTGGTCGCGCCGTCGCAGCTGACGCCACGCAAAATGA 
ACACAACCGAAGGCTATGCGGCGATGCTGCACGCGATTGCGCATATCGAATTCAACGCCA 
TCAATCTGGCTTTGGACGCGGCATACCGTTTCCGCACGCTGCCGTTTCAGTTTGTCCGCG 
ACTGGGTGAAAGTGGCGAAGGAAGAGGTGTACCATTTCCGCCTGATGCGCGAAAGGCTGC
GCGCTTTCGGCTTCGATTACGGCGATTTTGAAGCACACAATCATTTATGGGATATGGCAT 
ACAAAACCGCCTACGATCCTTTGTTGCGTATGGCTTTAGTGCCGCGCGTTTTGGAAGCGC 
GCGGGCTGGACGTTACGCCCGGCATACGCGCGAAGGTGGCGCAGCGCGGTGATTCGGAAA 
CCTGCGGCGTGTTGGACATCATTTACCGCGACGAAGTGGGACACGTCGCCATCGGCAACC 
GGTGGTATCAACACCTTTGCCGCGAACGCGGTTTGGAGCCTGTCGCCCTGTTCCGCAGCC
TGATTGCCCGTTACGATATGTTTATCTTCCGGGGCTATGTGAACATCGAAGCGCGCGAAA 
AAGCAGGCTTCAGCCGCTTTGAATTGGATATGTTGGAAGATTTCGAGCAGGGTTTGAAAC 
AAAATAAACATGCCGTCTGAAACCCTTCGTCCCGCACTTTATAAAAjyVGGAACACACATG 
ATACAAGCCGTATTGTTCGACCTCGATGGCACGCTCGCCGACACCGCCCTAGACCTCGGC 
GGCGCACTCAACACCCTGCTCGCCCGCCACGGACTACCTGCAAAAAGCATGGACGAAATC
CGCACCCAAGCCAGCCACGGCGCGGCAGGACTGATCAAGCTCGGCGCAGGCATCACCCCC 
GACCATCCCGACTATGCCCGATGGCGCACCGAATACCTTGACGAATACGACAGCCGCTAC 
GCCCAAGACACCACCCTCTTCGACGGCGTAAACGAACTCATCGCCGAACTCGGAAAACGC 
GGCATCAAATGGGGCATCATCACCAACAAACCCATGCGCTTCACCGACAAACTCGTCCCC 
AAACTCGGCTTCATCATCCCACCCGCCGTCGTCGTCAGCGGCGACACCTGCGGCGAGCCC
AAGCCCAGCGTCAAACCCATGCTGTATGCGTGCGGACAAATCCACGCCGACCCGCAACAC 
ACACTCTACGTCGGCGACGCGGAACGCGATATACAGGCGGGGCGCAACGCCGGTATGACG 
ACCGTCCTCGCCGAATGGGGCTACATCGCTCCCGAAGACGATACCGGCTCATGGCAGGCG 
GATTTCCACATCCGCACGCCACTCGATCTGCTCGAATGTCTGGACAAAATACAGCCCTGA 
AAAATATCCGCCCCACAAACATATAGTGGATTAACAAAAACCAGTACGGCGTTGCCTCGC
CTTAGCTCAAAGAGAACGATTCTCTAAGGTGCTGAAGCACCAAGTGAATCGGCTCCGTAC 
TATTTGTACTGTCTGCGGCTTCGTCGCCTTGTCCTGATTTTTGTTAATCCACTATAAAAC 
TGCCGTCTGAAACCTGATTTCAGACGGCAGTTCCGCCTTCAAACCGAATCAAAGCCCGTC 
AAAACCTGCGTTTGAGCTTGCACGCCTGAAGGATGTGTACCGCCAATTCCTCAACCGACT 
TATCCGTCGTATTCGCAAACGGAATCCCATGCCGTCTGAACATACTCTGCGCGTCCGCCA
CCTCGCTGCGGCATGTATCGATTTTGGCATAAGTTGAATTCGGGCGGCGCTCTTGGCGGA 
TGGCCTGCAAACGTTCCGGCTGGATGGTCAACCCGAACAGCTTATCCCTATAAGGCTTGA 
CCATACGCGGCAGATCGGCCGATTCCAAATCGTCGGGAATCAGCGGATAGTTTGCCGCAC
GGATGCCGTATTGCAGGGCGAGGTAAAGGCAGGTCGGCGTTTTGCCCGAACGCGATACAC 
CCATCAAGATTACATCCGCTTCCTGAAGGTTCTTATCGCTGACCCCGTCGTCGTGGTTCA 
AAGAAAAATTGACCGCTTCCATACGCGCATCATAACGCTTCGTATTACCGATACTGTGAT 
GCCCCTGCCCGGATGCCGTGGCTTCGGTATTGAGTTCCTTCTCCAACAGTCCCAAAAAAG
TCTCAAAGAAATTAATCTGAAAAGCATCCGCCCCTTTGATAATCCGACGGATTTCGTCAT 
CAACCACACTGACAAACGCAATCGGACGCTGACCGTTTTCCTGCCGGCTCCGATTGACCT 
TCTCCACCACCGCGCGCGCCTTTTCCGGCGTATCGACAAACGGATGCGTATGGCGTTTGA 
ACGACAGATTGCCAAACTGGTTCAGCAACGCCTCGCCGATATTCTCAGCAGTCAGACCGG 
TACGGTCGGAAATGTAAAACACATGGCGCGGACTGCTCATCTTCCATCCTTAAAACACAG
GTTTAAAATCCCATCATAACAGCAGCAGCAGACACAAGGAAAGCACCCGCAGCACACCTA 
CCTCGGATTACACCCAAACAGACACAATATAATTTTGAAATAAAATTATTTATATAAAGT 
ATTTTTTGGCAGAAAATTTTAAAAAATAAACAAAAAAATCAAACAGAAAAAACATTAACC 
TATTCAAACCACCTGTTTTACAAAGAAAATACCCAAAAAAAGAAGTATACCGGCTGTAAG 
TTTCAAACCGCTACACACGCCGAAACCGCAATTTTTCAGACGGCATCATGATTTTAAAAC
GGATAAAACACATGACAGCAGAGGAACGAATCGCTTAAAATAAGCACGCGGATTTGTTTC 
TTTTTTAACATATTTTGGATTGGACACACAAATGGCCGACAACTACGTAATCTGGTTTGA 
AAACCTGCGTATGACAGATGTTGAACGCGTGGGCGGTAAAAACGCCTCGCTGGGCGAAAT 
GATCAGTCAGCTGACCGAAAAAGGCGTTCGCGTCCCCGGCGGCTTTGCCACCACGGCCGA 

AGCCTACCGCGCATTCCTCGCACACAACGGTCTGAGCGAACGCATTTCCGCCGCACTGGC
AAAATTGGATGTCGAAGACGTTGCCGAACTGGCACGCGTCGGCAAAGAAATCCGCCAATG 
GATTTTGGATACGCCTTTCCCCGAACAGCTCGATGCCGAAATCGAAGCGGCATGGAACAA 
AATGGTTGCCGATGCCGGCGGTGCGGACATTTCCGTTGCCGTACGTTCTTCCGCAACTGC 
CGAAGACCTGCCGGACGCATCATTCGCTGGACAACAGGAAACCTTCTTGAACATCAACGG 

CTTGGATAACGTTAAAGAAGCGATGCACCATGTATTCGCTTCCCTGTATAACGACCGTGC
CATTTCTTACCGTGTCCACAAAGGCTTCGP-ACACGACATCGTCGCCCTTTCCGCCGGCGT 
TCAACGCATGGTGCGTTCCGACAGCGGCGCATCAGGTGTGATGTTCACCCTCGACACCGA 
ATCCGGCTACGATCAAGTCGTCTTTGTTACCTCCTCTTACGGTCTGGGCGAAAACGTCGT 
ACAAGGTGCGGTCAACCCGGACGAATTTTATGTGTTCAAACCCACGCTCAAAGCGGGCAA 

GCCCGCCATCCTGCGTAAAACCATGGGTTCAAAACACATCAAAATGATTTTTACCGACAA
AGCAGAAGCCGGTAAATCCGTAACCAACGTCGATGTCCCCGAGGAAGACCGCAACCGCTT 
CTCCATTACCGACGAAGAAATTACTGAGTTGGCGCATTACGCACTGACCATCGAAAAACA 
CTACGGCCGCCCGATGGATATCGAATGGGGACGCGACGGCTTGGACGGCAAACTCTACAT 
CCTGCAAGCCCGTCCCGAAACCGTAAAATCCCAAGAAGAGGGCAACCGCAACCTGCGCCG 

CTTCGCCATCAACGGCGACAAAACCGTATTATGCGAAGGCCGCGCCATCGGTCAGAAAGT
CGGTCAGGGCAAGGTGCGCCTGATTAAAGATGCTTCCGAGATGGATTCCGTCGAAGCCGG 
CGACGTACTCGTTACCGACATGACCGATCCGGATTGGGAACCCGTGATGAAACGTGCTTC 
TGCCATCGTTACCAACCGCGGCGGCCGTACCTGCCACGCCGCCATCATCGCGCGTGAATT 
GGGCATTCCTGCCGTTGTCGGCTGCGGCAATGCAACCGAATTGCTGAAAAACGGTCAAGA 

AGTTACCGTATCCTGTGCCGAAGGCGATACCGGCTTTATCTATGCCGGTCTGTTGGACGT
ACAGATTACCGATGTCGCCTTAGACAATATGCCTAAAGCACCTGTAAAAGTCATGATGAA 
CGTCGGCAATCCCGAACTCGCATTCAGCTTCGCCAACCTGCCCAGCGAAGGCATCGGCTT 
GGCGCGTATGGAATTTATCATCAACCGCCAAATCGGTATCCACCCCAAAGCCTTGTTGGA 
ATTTGACAAACAAGACGACGAATTAAAAGCGGAAATTACCCGCCGTATCGCCGGTTACGC 

GTCCCCTGTCGACTTCTACGTCGATAAAATCGCCGAAGGCGTGGCGACATTGGCCGCATC
GGTTTATCCGCGTAAAACCATCGTCCGTATGTCCGACTTCAAATCCAACGAATACGCCAA 
CCTGGTCGGCGGCAACGTATACGAACCGCATGAAGAAAACCCGATGTTGGGCTTCCGTGG 
TGCGGCGCGTTATGTCGCCGACAACTTCAAAGACTGTTTCGCCTTGGAATGCAAAGCCTT 
GAAACGCGTCCGCGATGAAATGGGGTTGACCAACGTTGAAATCATGATTCCGTTCGTCCG 

CACTTTGGGCGAAGCCGAAGCCGTTGTCAAAGCCCTG.'^GAATVACGGCTTGGAACGCGG
CAAAAACGGCCTGCGCCTGATTATGATGTGCGAGCTGCCGAGCAACGCGGTATTGGCGGA 
ACAATTCCTGCAATACTTCGACGGCTTCTCCATCGGCTCGAACGACATGACCCAACTGAC 
CCTCGGTCTCGACCGCGACAGCGGCTTGGTATCCGAATCGTTTGACGAACGCAACCCTGC 
CGTCAAAGTGATGCTGCACCTTGCCATCTCCGCCTGCCGCAAGCAGAACAAATATGTCGG 

CATCTGCGGTCAAGGCCCGTCCGACCATCCGGACTTCGCCAAATGGCTGGTTGAGGAAGG
CATTGAAAGCGTTTCCCTGAACCCGGATACCGTCATCGAAACTTGGCTATATTTGGCGAA 
TGAATTGAACAAATAATCAATGCCCATACCCCCGAGCCTGAAAAGCGCGGGGGTATTTTT 
TTCCAAGCAGCTCCGCTCAGACGGCATTTCCTGCCGATGCCCCGTCCGCGATAATATTTG
ACACCCACGCGCCGACTGCCTACAATTCCCCCTTCCCCGAGCAACCGGCAACGGTCAGCT 
TCTTCTTTCAGACGGCATCTGCCTGTCTTTTTCCTCTTCAAAATACATCATTATTATGCA 
CGTCTCCGAATTACAAACCCTGCACATTTCCAAACTCTTAGAATTGGCGGAAGAACACGG 
CATCGAAAACGCCAACCGATTCCGCAAACAAGACCTCGTATTTGCCATCGTCCGCCAGAT
GATGAAAAAAGGCGAGGGTTTCACCTGCTCCGGCACGCTTGAAATCCTGCCCGACGGCTT 
CGGCTTCCTCCGCAGCGCGGACACGTCCTATCTTGCCGGCCCCGACGACATCTATGTCTC 
GCCCACCCAAATCCGCCGCTTCAACCTGCATACGGGCGACACCATCGAAGGAAGCGTGCG 
CGTCCCAAAAGACAACGAACGCTATTTTGCCCTGGTCAGGCTTGATACCATCAACGGCGA 

CCACCCGGAAGTATGCCGCCATAAAATCCTGTTTGAAAACCTGACCCCGCTGTTTCCGAC
CGAACAGTTGAAGCTGGAACGCGACTTAAAGTCCGAAGAAAACCTGACCGGACGTGCCAT 
CGACCTGATTTCCCCTATCGGCAAAGGTCAGCGCGCCCTCTTGGTTGCCCCGCCCAAAAG 
CGGTAAAACCGTGATGCTGCAAAACATTGCCCACGCCGTTACCGCAAACTATCCCGAAGT 
CGAACTCATCGTCCTCTTGATTGACGAACGTCCCGAAGAAGTAACCGAAATGAGCCGCTC 

CGTCCGTGGCGAAGTAGTCTCCTCCACCTTTGACGAGCCGGCTCAACGCCACGTCCAAGT
TGCCGAAATGGTGCTTGAAAAAGCCAAGCGTATGGTGGAACACAAAAAAGACGTGGTCAT 
CCTGCTGGATTCGATTACCCGCCTTGCCCGCGCCTACAATACCGTCGTGCCTACCTCGGG 
CAAAATCCTGACCGGCGGTGTCGATGCCAACGCGCTGCATCGTCCCAAACGTTTCTTCGG 
CGCGGCGCGCAACGTGGAAGAAGGCGGTTCGCTGACCATCATCGCCACCGCATTGGTTGA 

AACCGGCAGCCGTATGGACGATGTGATTTACGAAGAATTCAAAGGCACCGGCAATATGGA
ATTGCACCTTGACCGCCGTATGGCGGAAAAACGCCTCTTCCCCGCCATCAACATCAACAA 
ATCCGGCACGCGCCGCGAAGAGCTGCTTGTCCCAAACGACCAGTTACAACGTATGTGGCT 
CTTACGCAAGTTCCTGCACCCGATGGACGAAATCGAGGCAGCCGAATTTTTAATCGGGAA 
AATCAAAGCCTCTAAAAACAACGACGATTTCTTTGAACTGATGCGCGGCAAATAAACGCG 

CCGCCCGCATTAATGCGAAATGCCGTCTGAAGCCTGAAAATCGGGTTTCAGACGGCACTT
TCATTCACACGGTCGGCGCAGCTCCTCCCTCCCCCCGTTAACGGCGCAACCGTCGGCAGT 
GTCCGTGTCCGCTTGCCGAAAGCGCGGCCTTTGCAAAGCCGGCTTGAACGCATCCGTACC 
GGCAATGAAAACCGATGCGGGAACTTGATTCAAGGTTGCGCCGAACCGCACTCATTTTTG 
TATAAATTTGGGGCTGTCCTAGATAACTAGGGAAATTCAAATTAAGTTAGAGTTGCCCCT 

ATGAGAAAAATTCGTCTAAGCCGGTATAAACAAAATAAACTCATTGAACTGTTTGTCGCA
GGTGTAACTGCAAGAACAGCAGCAGAGTTAGTAGGCGTTAATAAAAGTACCTCAGCCTAT 
TATTTTCATCGTTTACGATTACTTATTTATCAAAACAGTCCGCATTTGGAAATGTTTGAC 
GGCGAAGTAGAAGCAGATGAAAGTTATTTTGCTGAACGACAAAACCATATCAATGGAATT 
GGGAACTTTTGGAACCGGGCAAAACGTCATTTACGCAAGTTTGACGGCATTCCCAAAGCG 

CATTTTGAGCTGTATTTAAAGGGGTACGAACGACGTTTTAACAACAGCGAGATAAAAGTT
CAAATTTCCATTTTAAAACAATTAGTAAAATCGAGTTTATCCTAGTTATCTAGGACAGCC 
CCATAAATTTTTATAGTGGATTAACAAAAACCAGTACGGCGTTGCCTCGCCTTAGCTCAA 
AGAGAACGATTCTCTAAGGTGCTGAAGCACCAAGTGAATCGGTTCCGTACTATTTGTACT 
GTCTGCGGCTTCGTCGCCTTGTCCTGATTTTTGTTAATCCACTATATTTCCAATTAAAAA 

TTTTTAATATATTAATCAATAAATTAATTTTATAAAATAAAAATATTGTCAACAATATTT
TGCCTTATCGCCCAAACCTCTGTATATTTTCCTACAGTAAATTGTTGACAATCCATACGC 
CCACATATGCGCCGCCTAAGGATAAATCCTCCCGCCGGACAACGGGTGCAAGGGATCGGA 
TGCGATATTTCCATATTCAAACAAGGGATTGGCGGCACATCGGCAAAATCCCCGCGCCGC 
CCCGGTCCGGCAGGGCTTGCGTCCCTCCCGGACAAGCCCCGACCCCGCCTTTCCGAAAGA 

CGGGCTCAACCATTAAGGAAACTTTAGTCAAAATGAAAAAACACATATGGGCGGCATCTT
TGCTGCCGGCATCCCTATCGGCAGAACCTTTAAACTGGTGGAAGCCTTATTCCGCCGTCA 
ATTCGGGCGATACCGCCTGGGTGATGACTGCGGCTGCCTTGGTACTGTTGATGACGCTTC 
CCGGGCTGGCTTTATTCTACGGCGGTATGGTGCGGAAAAAAAACCTGCTCTCGACGATGA 
TGCACAGCTTTTCCATCGCGACATTGGTGGGCATCCTTTGGGTCGCCGTCGGCTATTCTT 

TAGCGTTCACGCCGGGAAATGCCTTTATCGGCGGTTTGGGGCGCGTATTTTTAAGCGGGA
TGCAGATAGACGCTACCGCACAGATGCTGACCGTGTCGCCCAATGCGCCGACTGTTCCCG 
AACCGGTATTTATGTTTTTTCAGATGACGTTTGCCATTATTTCGACCGCCATTATTACCG 
GCGCGTTTGCCGAACGGATGAAATATTCGGCAATGATGCTGTTTTCGGGCATATGGTTTT 
TATTGGTTTATGTGCCGGGCGCGCATTGGGTGTGGGGCGGCGGCTTTATGAGCAAGGGCG 

GCGTATTGGATTATGCCGGCGGTACGGTGGTGCACATCAATGCCGGTATCGCGGGACTCG
TCGCCGCCTTGGTTTTGGGCAGGCGCATAGGCTACGGGCGCGAGGCGATGCCTCCGCACA 
ATATGGCGATGACACTGATCGGCGCGGCAATGTTGTGGTTCGGCTGGTTCGGCTTTAACG 
CCGGATCGGCGCTTGCGGCAGACGCGGCGGCGGGTATGGCGATGGCGGTAACGCAGGTGT
CGGCCGTATTCGGCGCGGCAGGCTSGCTTGCCTGCGAAAAAATAGCGGGACACAAACCTT 
CCGCTTTGGGGCTGGCTTCCGGCGCGGTTTCCGGTCTGGTCGGCATCACCCCTGCCGCCG 
GCTTTACCGGCCCGTCGGGCGCGGCCGCCATCGGTATATTGACTGCCGCCGCGTGCTTTG 
TGTCCGTCACCGTCGTCAAACACAAATTGCGTTACGATGATTCTTTGGACGCTTTCGGCA
TACACGGATTCGGCGGGCTGGTGGGCGGAATATTGACCGGCATCTTTTTCGACAACCGCA 
TTTTCGGCGGGGATGCGGCAGTTT3GCAGCAGTTGTGGATACAGGTAAAAGACGGGGTCG 
TTATGGCGGCATACAGCGGGCTAATGAGTTGGGCGATTTTGAAGGTCGTGGGGAAAATCT 
GCGGCGGCCTGCGCGTCGGCAAGGATGTCGAACGCGAAGGTTTGGATCTGAATATCCACG 
GCGAACGCGTGGAATAAGGGCGGCTATGCCGTCTGAAGCCTGAAAATCGGGTTTCAGACG
GCATTTTTCACGTTTGCCGCCGATGGATAAACATATAGTGGATTAACAAAAATCAGGACA 
AGGCGACGAAGCCGCAGACAGTACAGATAGTACGGAACCGATTCACTTGGTGCTTCAGCA 
CCTTAGAGAATCGTTCTCTTTGAGCTAAGGCGAGGCAACGCCGTACTGGTTTTTGTTAAT 
CCACTATACTGCCTGCGACGCTTAACGGCTGTCTTTCCACTGATAATATTTCGAGCCGAG 

GAAAGAACCGAGTTTGCGCAAAAACGGTTGCAGGATAATATTCGGCTGGCGCAACCGCTC
AAACGGCTTCAGACGGCATTCGTCCCCTAAAATTGCTTCGGCAACCGCCAGACCTGCAAT 
GCCTGTTATCGCCATCCCGTGTCCGGAATAACCTTGCGCATTAAAAACATTCGGGGCTAA
ACGTCCGAAATGCGGGACAAGGTTGGCGGTAATGTCGCACTCCCCGCCCCACGAATATTC 
GATTTTGACATCGGCAAGCTGCGGAAAAACTTTAAGCATATCTTGGCGGACAAGCTCGGT 

CATACGCTCAGGATTGTCGATAAACTCGTTATCCTTACCGCCGAAAAGCAGTCTGCCGTC
CGCGCTGAGGCGGTAATAATCCAAAATATGGCGGTTGTCGCATACTGCCATATTGTTACG 
GATAAGCCCTTTTGCGCGCGCCCCCAAGGGTTCGGTCGCAATAATAAAGGTGCTGACAGC 
AATCGCCTTGCGTTCCAAAGGCCGGAATATCGGGTTCAAACCTGCATAAGTATTGACAGC 
ATAGACCACATTTTTGCACTCGACGCTGCCTTCGGGCGTGTAAACCAGCCAACCGTTTTG 

ATGCGGTTCGATGCACGTCATCGGGGATTGCTCGAAAATCTGCGCACCGGCTTCGGCAGC
GGCACGAGCGATGCCCAAAGTGTAAGTGAGCGGATGCAGGTGTCCGGATAAGGGGTCGAA 
TTGTGCCCCTTGGTACATATCGCTGTCAAGCTGCTGTTTCAACTCGGCTTTATCCCAAAG 
TTGATAATGACTCGCACCGTAATGCCGTTGGGCGTGTTCATGCCACTGCTGCAACTCTTC 
CCAATGCTGCGGACGGACGGCAACCGTGGCATAACCGCGCTGCCAATCACAATCGACGGC 

ATGTTTGCGGACGCGTTCGTCCACCAGTTCGACCGCCTGCAAAGACTGTTGCCAAAACCA
TTGCGCCTGCTCCAAGCCGACCTGTTTTTCAATTTCCCCCATACCGCAGGCGTAATCGCT 
GATAACCTGCCCGCCACTCCGTCCCGACGCGCCGAAACCGATACGCGCGGCTTCCAACAC 
AACCGTTTCATGTCCCTGCTCCGCCAAGGGCAATGCAGTGCACAAACCACCCAATCCGCC 
GCCGATGATACAGGTATCGGTTTTCAGACGGCATTGAAGTTTCGGATAAACAGTATGAGG 

ATTAACCGAACTGAAATAATAAGAAGGCAGATATTCTTGAAAATCAGGGCGAATCATTGT
GTTTGCTTTATCAGGTGTATTTTCGGACGGAATGATACAGGCTGTCGGGCCATATCGTCC 
AAACAGAAAATCGGTTGAAGAAAACAGGCTGACCCAGTCATGCGGTCAGCCTGCCTTATT 
AATTAATTTGCTTTCTCGGCAGCCAATTTTTCCTGGCGGTAGGCTTCTGCCGCTTCTCGG 
TCACGCTTGGTTGCCTGCCTCATCATCCAATAATTGACGATGATGACCAATGTTCCGATG 

ATGCCGATTAGGATGGTCGCCAAGACATTCATCTGAGGATCGAGACCCAACTTGATTTTG
GAGAAAATCACCTGCGGCAATGTGGATGAACCGGGGCCGGAGAGGAATGAGGTAATCACC 
AAATCATCCAAAGACAGGGTAATGCCGAGCAGAAAGCCTGAAGCGATGGCAGGGGCAATC 
AAAGGCAAAGTGATGACAAAAAAGATTTTCAGCGGGCGCGCGCCCAAATCCATTGCGGCT 
TCTTCGAGCGACTGGTCAAGCTCAACCAGACGCGAACGGATAACAACGGTAATGTACGCC

ATACACAGCGTCGTATGTCCGAGGAAGATGGTGAAAAAGCCACGATCGAAGTAGAGATGT
TGTAACCATTCGCTGCCCTGCAAAAATATCTGTACCTGAATAATCAGCAGCAGCATAGAC 
AGACCGGTAATCACGTCGGGCATCACCATAGGTGCGGAAATCATGCCAGCGAACAAGGTA 
CTGCCGCGAAAACGTTTAATCCGCGCCATCGCATAGCCTGCCAGCGTGCCCAAAACGACG 
GCGGCAAGCGAAGACACAACGGCAATCCGCAGCGACAGCCAAGCGGCTTCCAAGATGGTG

TCGTTTTCCAGCAATGCGCCGTACCACTTGGTCGAAAAGCCGCCCCAAACGGTTACCAGC
TTGGATTCGTTAAACGAATAGATGACCAAAACAACCAGCGGGATATACAGAAACGCCAGC 
GACAGTGCCAACATCAGTTTCAAGAACCAAGATAATTTGGATTTCTGCATTATTTGGCTC 
CTTCTTCCAATTCGCGGTTTTCATAATGCTGAAACAGGGCAATCGGCACGACCAGCAGCG 
CGACCATCACGACGGCGACGGCGGAAGCCAGCGGCCAGTTGTTTTGATCGAAGAACGCCT 

GCCACAAGACTTTACCAATCATCAGGTTTTCCGAACCGCCGACCAGCTCGGGAATGACGA
ACTCGCCGACAGCAGGGACGAAAACCAGCATGGAGCCTGCAATAATGCCGGTTTTCGACA 
AAGGCAGGGTAATCGTCAAGAACGATTTGACCGGCCCCGCGCCCAAATCGGAAGCCGCTT 
CAAGCAGGCGGTTGTCGAGTTTCACCAGTTGCGTGTATAGCGGCAGAATCATAAACGGCA
GATAGGCGTAAACCATCACCAAATTGAGCGAAAAGGCATTGTAGAACAAATCCAAAGGCT 
CGCTGATAATACCCATTTTAATCAACAGGTTGTTTACAATGCCGTTATGCCCGAGCAGAC 
CCATCCACGCATAGACGCGCAACAGGAACGATGTCCAAAAGGGCAGCATAATGGCAAGCA 
GCAAACCATTGCGGACAGAAGGATTGGCACGAGAAATCGCATAGGCGGTCGGATAACCGA
CCAACAGACAAATTACCGTCGTAGTCAGCGCAGTCTTAATTGAAGACCAATAAGTCATCA 
GATAGATATTGCTGTTTTCACCGTCGCCGAACGGATTGAGCGTACTCCAAAAATTTTGGA 
AGATGTCTGCATAATTTTGGTAGCTGACAGCAATATTCAGACGACCCAAATCCTCATCTA 
TCGTCGTTAAAGGAGTAAACGGCGGGATGGCGATTTCTTGTTCGGCAAAGCTGATTTTCA 
GCACGATGGCGAACGGAATCAGAAACAGCACCAAAAGCCAAATATACGGTACGGCAATCA
CCGCACGCTGCCCCGGACGGCGGAACAGTTTGTTTTTCAGTTTATTAAGGTTCATTGCAT
TCCCCTTAAATCAACGGAACAACGGAGTCGGTTGGTTTTCCGGCCAGCTGATATAGACGG 
TTTCGTCCCAAGTCGGCGGTGTAATGTTGCGCACATACCAGTAAGGGGCGGGGACTTGGC 
TTTTGACGACGCGCCCGTTGCCGAGCTTGATATGGTAAATGGCGAAGCTGCCCAAATAGG 
CGATTTCTTTTACCGTGCCTTTCGCCCAGTTGTAGTCGCCCAAATATTCGGGTTTTTCTT
TATATAAATCAATATCCTCTGGTCGAATACTAACCCAAAGGTCCTGCTCGCTCGGACCAC 
CCAAACCGTGATCGATGCGGACGTGGTTTTCCAAACCTTCGCATTCGATAACGGCATAGT 
CGGCATGATCTTCAATCACCACACCGTCAAAGATGTTGGTTTCGCCGATAAACTCGGCAG 
TGAAGCGGCTGTTGGGATAGTCGTACACGTCGCTGGGTGTGCCGACTTGCTGCAACTGAC 

CGTCAGACATAATGGCGATGCGGGTCGCCATCGTCATCGCCTCTTCTTGGTCGTGCGTAA
CCATAATACAGGTTACGCCGACTTGTTCCAGCGTATTGACCAACTCAAGCTGGGTTTGTT 
GGCGCAGTTTTTTGTCCAATGCACCGAGGGGCTCATCCAGCAGTAGAATTTTCGGACGTT 
TTGCCAGACTGCGTGCCAAAGCAATGCGCTGCTGCTGACCGCCGGACAATTGGTGCGGTT 
TGCGTTTAGCAAATTTGGTCATCTGAACCAGGCGGAGCATTTCTTCGACGCGCGCGGCGA 

TTTCGCCTTTAGGCATTTTGTCCTGTTTCAGACCGAAGGCAATGTTTTGTTCTACGGTCA
TATGCGGAAAAAGCGCGTAACTTTGGAACATCATATTGATGGGGCGATCATAGGGTGCAA 
GTTTGGTAATATCCTGACCATCAAGGATAATTTTTCCCTGATTGGGACTTTCCATACCCG 
CCAGCATACGCAGCAGTGTAGATTTTCCGCTGCCGGAACTGCCCAAAAGGGCGAAGATTT 
CGTGTTGATAAATGTCCAAGTCGATGTTATCGACAGCGTAATTGTCACCAAACTTTTTCA 

CCAAACCTTGGATTTTGAGATAAGGTTTGGCTGAAGACGCAGTGGTTGCGGTCATAATGG
CAATACTCCAATAAAAAGACGAGTACCGGCAAAACGGATTTTCGAATGGGTGATAAAAAG 
CTGTTTGATTGCTGGCGGGAGTTAAACGTTTGATGCCGTCTGAAACTCTTGTAAAGCGCA 
CGGGCAGCATGAAATGGAACAAGATTCCAAAGAACTTTATATTATATTAGTTTATGCGGT 
TTTCGGGCAATATAGTGGATTAAATTTAAACCAGTACAGCGTTGCGTTGCCTTGCCGTAC 

TATTTGTACTGTCTGCGGCTTCGTCGCCTTGTCCTGATTTAAATTTAATCCACTATAAAT
ATGGATTTGAGCTTGTCGGAAAGCAACAGAAAAGAAAACACCGCCCATTTTTCTGGGCGG 
TGTCGGAAAGCGTAATTATTTACGCAGACCCAAGCGGGTAATCAACGCGCGATACGTATC 
GGGCTGGGTACGGCGCAAGTAGGCCAGCAGGCGGCGGCGTTGGCTGACCATTTTCAACAG 
GCCGCGACGGCTGTGGTGGTCTTTGGGGTTGGCTTTGAAGTGGGGGGTCAGGTCGTTGAT 

GCGGAT^GTCAACAGAGCGACTTGTACTTCGGAAGAGCCGGTGTCGCCTTCTTTGCGTTG
GAAATCTTTAACGATTTGTGCTTTTTGTTCTACGGTCAGTGCCATAATGAAAACTCCAAA 
AATATAAGAATCCCCATAGGGATTCAGACAAGTTTGCCAAGCCTGAAGACAACGGCAAAC 
TCCCTATGCCCAAGATAGGACAACGTGGCATTATGACACAATTCCCGCATTTCTGCACAA
TATTTTAAGACCGTGGAAGTAACCGTTAAAAATGCCGTCTGAAGCATTGTCTGCTTGAGA 

CGGCAATGTTCAGACGGCATATGCGCTTCAAACCACGATTTCTTCTTTCTTCTTCGGTTC
TTTACCGATATTGTCGCGGCTCAAACCGAACATTAGCAGAAGCGGGCTGGCAACCAATAC 
GGAAGAATAAATGCCGAACACGATGCCAATGGTCAACGCCATAGAAAAGCCGTGCAAGGC 
CGCACCGCCGAACACCAGCATGGATACGACCATCGCCTCGGTCGAACCGTGGGTAATGAT 
GGTGCGGCTCATCGTTGCGGTAATCGCGTTGTCGATGACTTCCGGCACGGCATGTCCGCG 

CATCGCCGGCTTGCGGAAGTTTTCACGGATACGGTCGAAGACGACGACGGATTCGTTCAC
AGAATAGCCCAATACGGCAAGGATACCCGCCAAGACGGTCAGCGAAAATTCCCATTGGAA 
GAAGGCAAAGCAGCCGAGAATAATCACGATGTCGTGCATATTGGCGATAATGGCAGATAC 
GGCAAAACGCCATTCAAAACGCATCGACAGGTAAATAATGATGCCGATAACGACAAAACC 
TAAAGCCATCAATCCATTACTTACCAATTCCTCACCGACTTGCGGGCCGATAAATTCGAC 

TTGGCGCAAGGTAACGTCGGGACTGTCTTTTTTCAGCAAATCCATAACCTGATTGGACAA
CTGTGCGGAAGTAACACCTTCTTTGTTCGGCAGGCGGATCATGATGTGTTTGTTCGTACC 
CAATGCCTGAACCTGTACATCACCTATTTTCAGCGTATCGAGGCGTTCGCGCATCTTATT 
GACATCCGCACCCTGCTGATATTGGACTTCCATTACCGTACCGCCGGTAAATTCGACAGA
GAAATTCAGACCTCTGGTAACCAAAAAGAACACGGCAGCGATAAACGTAACCAACGAAAT 
GAAGGTCGTCAGTTTGCCGTAGCTCATAAACGGAATATCGCGTTTGATTTTAAAGAGTTC 
CATAGCTTACTCCTTGCCTCCTGCCATTTCGGCTTTCGGCTTCCACACCGAACCAATGGA 
AATATTCTGCAATTTGCGTCTGCGTCCGTACCACAGATTGACCAACGCACGGAATACGAC
GACGGATGAATACATCGAAGTCAGAATACCCAAACAGTGTACGACCGCAAAACCGCGTAC 
CGGGCCGGAACCGAATACCAAAAGCGCGATACCGGCAATCAGCGAAGTCAGGTTCGAATC 
GACAATGGTCGCCCATGCGTGTTGGAAACCGAGATTGATTGCCTGCTGCGGCGGCACGCC 
GGCACGCAATTCTTCGCGGATACGTTCGTTAATCAAGACGTTGGAGTCGATTGCCATACC 
CAAAGTCAACGCCAGCGCGGCCATACCCGGTAACGTCAACGTTGCCTGCATGGCAGACAA
AATACCGATTAGGAACAGTATGTTGGCACTCAATGCAATGGTAGAAAAGAAACCCATCAG 
ACGATAGTAAACCACCATGAATGCAGCAACGATGGCAAAACCCCATAAAGTCGAATGGAA 
GCCTTTTTCGATGTTCTCCTTACCCAAAGACGGACCGATGGTACGTTCTTCGACAATCTG 
CATCGGTGCGGCAAGAGAACCGGCACGCAACAGCAAAGACGTATCATTGGCTTCGGCTGT 
CGTCATGCTTCCGGAAATTTCCACGCGTCCGCCGGTAATGGCAGTACGGATAACCGGCGC
GGTTACAACCTCGGATTTTCCTTGGTCGATCAAAACCATCGCCATGCGTTTGCCGACATT 
TGCGGCAGTCAGTTCGCCGAAAATGCTGCCGCCCGCGCTGTCCAAGCTCAGACTGACGGC 
AGGTGCGCCCATTTGGTCGAAACTCGGTTGCGCATCGTTGATGTTGTCGCCCGTCAGCTC 
GACCTGTTTGCTGATCAGCAGAATTTCGGGACGATCTCCGCCGCTTGAAAGCAGCTCATA 
ACCGCTCGGCACGTTGCCTTCCAATGCCTCGCGCAACTTGGCAGGATCGTCCTCCACCAT
ACGCAATTCCAAAGTCGCGGTACGGCCGATGATGTCTTTTGCCTTGGCAGTATCCTGAAC 
GCCCGGAAGCTGCACGACGATACGGTCTGCACCGGACTGCTGGATGACGGGCTCGGCCAC 
GCCCAACTCGTTCACACGGTTGTGCAGGGTAGTGATGTTCTGTTTGACCGCATCGGAACA 
CACTTTATTGACCGCCTCTTCCGAAAGCGTCAAGACGATATTGCTGCCGTCTGAATTCAG 
CGTTGCTTCAGaAACAGCTTGCGCAACTGCGGCAGAGCCTTTTGCACATCACCTGCATCC
TGCAAAGGGACGGTCAGGCTGTTTCCAGCCTGACGCACCGTGCCGCTGcGGATTTTTTCG 
CGGCGCAGTTCGCGGCGGATGTCGCCCGAATAACGTTCAAACGTTTTCTGCATCGTTGCT 
TTCATATCGACCTGCATGGTGAAATGCACGCCGCCGCGCAGGTCCAAACCCAAAAACATC 
GGATTGGCTTTGATTTTCGCCATCCATTCGGGGCTGTCCGCCAACAGGTTGAGCGCGGTA 
ATATACCCTTCGCCCAAAGTGTTTTCGATGACGTCGCGCGCTTTAAGCTGCGTTTCTGTG
TCTTTGAAACGCACTTTCAGTGAATTGTCCACAACAAACATCCCGTCGGTCTGAATACCT 
GCGTTTTTCAGCGCGGCATCCACTTTGAATTGAGTCTGTTCGTTGATGATGATGGCTTGT 
CGGTTGGTCGATACCTGCACGGCGGGTGTTTCGCCGAATAGGTTGGGCAGCGAATACACT 
GCGGCAACCGCAATCGTGAACACAATCAGCAGATATTTCCATAAAGGATAACGGTTCATC 
ATTGTTCCTTAATGGTTGGAACCCCACCCTTTCGGTGGTGTCGGAATCGGGCTATTTCAG
AAGAGGCATWiACCCTTCCCAGCCAGGCAAGACCGGAAAGCGGCATCCTGAATATGCCGC 
CCTGCGTGTCGGAACATGGTCAAGCCTTCGGTTGGAATTCAAAACAAAGTGCCGCATTCG 
GGCTTTCCAGATGCGGCTTGTCGGCACAAATCAATCGACTTTTGCGGCAATCGCATTGCG 
TTCCACTTCGACCTCGATTTTTGTACCCTGTCCGATATCCACGGTAAAAAACTGTTCGCC 
GACTCTGGTTACCTTACCCTTGAAACCTGCCGCCAAGACCACTTTGTCGCCGACTTTCAA
GGCGGCAAGCATTGCCTGATGCGCTTTGAATTTCTTTTGCTGCGGACGCATGATCAGGAA 
GTAGAACACCACCATAATCAACACTAAAGGAGCAAATTGTGCAACAGCTTGATTCATAAT 
TTATCCGTTCTTTCTAATATGGTTGAAAATCGAGAGGGGTATATAATAACATAAGACCGT 
AAACAATATATCGGGTTTGCCGTCCGTACCGACCGTATACCGCAGCCTGCCCGTCCACAA 
ACCCATGTCCTTGACCCGCCTAATCTTGAAATTCTATGCACTGTTGCGCCTTTTTTTGGG
CAAAAACGCCCGCACCGCATGGATTTCGCATCCCGCCTGTGCCGGGCACGAACCCGGCGC 
AAACCATCCCGATTCGCCCGACCGCATCCTCTGCATCGAGCAGGCATTGCGCCGCGCCGG 
TATTTGGCAGCACCTCCAAACCATAGAGGCGGAAGAAATCAGCGATACGCGCCTCGCACT 
TGTCCACTCGAGCAAATATCTGAACCGTTTGGAATCTTGCCTGCCCCAAAAAGGCAAGAT
TTCCCGCCTGGATAACGACACTGCAATCAGCACAGGATCGCTGTCTGCCGCACGCTTTGC
CGCCGGTTCGGCAGTTCAGGCAGTCGACATGGTCATGAACCGTAAAGCATGGCATGCCTT 
TTGCGCCGCCCGCCCGCCCGGACACCATGCGGGCAGCGGCAAAGCCGGCGGATTCTGCCT 
GCTGAACAACGTTGCCGCCGGCGTCATGCATGCCATTGCCGAATACCGCCTGAAACGCAT 
TGCCGTCATCGATTTCGATGTCCACTACGGCGACGGTACGGCAGAAATATTCAAAGACGA 
TCCGCGCATCCTGTTTTTCAACCTGTTTGAAACCGACCTTTTCCCCTTCCCCGAAAACAA
CGATATGCCCGACGGCGGCAATATGGTGCACCTGCCCTTGCCGCCAGGAACGGGCAGCCG 
CACATTCCGCGAAGCCGTCCGCAGGCAGTGGCTACCCCGACTTGCCGCATTCAAACCCGA 
ACTGGTGCTGCTGTCGGCAGGATTCGACGCACACCGTCTAGACGAATCGGGCAGGCTCAA
CCTGCACGAGGCGGATTTTGCCTGGCTGACACACAAAATCATTCAGACGGCATCGGGCTG 
CCCCGGCAAAATCATATCCGTACTCGAAGGCGGCTACACCCTTGAACCTCTGGCACAATC 
TGCTGCCGAGCATATCCGCGTGCTGGCAGGGCTGGGCAAATCCGATGCCGCAACCGCCTA 
TCAGAAAACACTGGACCCAACAAAAAAACGGTTTGCCAAACCGAAAACCGGACAGGTGCG
CCAACCTACCCAATCCGACCGTTACGACATATAGTGGATTAACAAAAATCAGGACAAGGC 
GACGAAGCCGCAGACAGTACAGATAGTACGGAACCGATTCACTTGGTGCTTGAGCACCTT 
AGAGAATCGTTCTCTTTGAGCTAAGGCGAGGCAACACCGTACTGGTTTTTGTTAATCCAC 
TATAAATGCCGTCTGAAGCCGATTCGGTTTTCAGACGACATCATTATTTGACCTGACGCA 
AACAGAAAGCCTTCCGCGTGCTTAAACACATCTCTTGAGACCTTTGCAAAAATAGTCTGT
TAACGAAATTTGACGCATAAAAATGCGCCAAAAAATXTTCAATTGCCTAAAACCTTCCTA 
ATATTGAGCAAAAAGTAGGAAAAATCAGAAAAGTTTTGCATTTTGAAAATGAGATTGAGC 
ATAAAATTTTAGTAACCTATGTTATTGCAAAGGTCTCCTCTTGTAGTAAAATATCTAAAC 
AGATTCAACTGGAAACGGTCAAGCTTCTTCACGCCATTCCCATTTCAACCGCCGCCTGCC 
AGATTCAGACGGTATTCCTACTTGAAAACAAGGAGCGTTACAATGAAAGCAGCACGTTTT
TACGACAAAGGCGACATCCGCATCGAAGACATCCCCGAACCGACCGTCGCCCCCGGCACT 
GTCGGCATCAATGTCGCCTGGTGCGGCATCTGCGGTACTGACCTGCACGAATTCATGGAA 
GGCCCGATTTTCATTCCGCCTTGCGGTCATCCGCACCCGATTTCCGGCGAGTCCGCACCC 
GTAACGATGGGACACGAGTTCTCCGGCGTGGTCTATGCCGTCGGCGAAGGCGTGGACGAC 

ATCAAAGTCGGCCAACACGTCGTGGTCGAACCCTACATCATCCGCGATGACGTACCGACC
GGAGAAGGCAGCAACTACCACCTCTCCAAAGATATGAACTTTATCGGCTTGGGCGGCTGC 
GGCGGCGGTCTGTCCGAAAAAATCGCCGTCAAACGCCGTTGGGTGCATCCGATTTCCGAC 
AAAATCCCGTTGGATCAAGCCGCTTTGATCGAACCGCTGTCTGTCGGACACCACGCCTAT 
GTACGCAGCGGCGCGAAAGAAGGCGACGTCGCATTGGTCGGCGGTGCAGGTCCGATCGGT 

TTGCTGTTGGCTGCCGTGTTGAAAGCCAAAGGCATCAAAGTCATCATCACCGAGTTGAGT
AAAGCACGCAAAGACAAAGCGCGCGAATCCGGCGTTGCCGACTACATCCTCGACCCGTCC 
GAAGTCGATGTTGTTGCAGAAGTGAAAAAACTGACCAACGGCGAAGGCGTGGACGTGGCA 
TTTGAGTGCACCAGCGTCAACAAAGTGTTGGATACTTTGGTCGAAGCCTGCAAACCTGCC 
GCCAATTTGGTTATCGTATCCATCTGGAGCCACCCCGCCACCATCAACGTCCACAGCGTC 

GTGATGAAAGAGTTGGACGTGCGCGGCACGATTGCCTACTGCAACGACCACGCCGAAACC
ATCAAACTGGTCGAAGAAGGCAAAATCAACCTTGAGCCTTTCATCACCCAGCGCATCAAG 
CTGGACGAGCTGGTTTCCAAAGGCTTCGAGCGTCTGATTCACAACAACGAATCCGCCGTT 
AAAATTATTGTGAGTCCAAACCTGTAAGCAAATTACATTTGCAACAGCACAAATGCCGTC 
TGAAACGCTCAGACGGCATTTTCCAACGAAGCGCAACAGGCGTATTGCCCGAGAGCAGCC 

GTTTAGCCTATCATTCCGAACAAACAGAAACAAACAGAAGGAAAATCATGGGAGATTCCG
TACTATCCGCCATCCAACAAACCATCACCCAGCGAAAATCTGCCAATCCGTCCGAATCTT 
ACGTCGCACAGCTCTTGCATAAGGGCGAAGACAAAATCCTAAAAAAAGTGATTGAAGAGG 
CGGGCGAAGTGTTGATGGCATCCAAAGACAAAAACCCGTCCCACCTGGTTTACGAAGTTG 
CCGACTTATGGTTTCACACCATGATTCTTCTGACACACCACGACCTGAAGGCGGAAGACG 

TATTGGACGAACTTGCGCGCCGTCAGGGGCTGTCGGGGCTGGTCGAAAAAGCCGCTCGCA
CAGAATCTTGAATTTATATTAAAATCCGCACTTTCCCACATTCAATCCGTCTGACCGCTT 
TTCAGACGGCATCGGAGCCGTTATGGACAACTGTATTTTCTGCAAAATCGCCGCCAAAGA 
CATTCCGGCGCAAACCGTCTATGAAGACGGCGAAATGGTTTGTTTCAAAGACATCAACCC 
CGCTGCTCCGGTTCATCTGCTGCTGATTCCCAAAGTCCATTTCGATTCGTTGGCACACGC 

CGCGCCCGAACATCAGATGCTGCTGGGCAAAATGATGTTGAAAGTTCCCGAAATCGCCAA
AGCGGCAGGACTGGCAGACGGCTTCAAAACCCTGATCAATACCGGAAAAGGCGGCGGACA 
AGAGGTCTTCCACCTGCATATACACATCATGGGTACACCCGTATAAACCGTTATTTCACA 
ATCAACCCCTAATACTTACTTAAGGATACATCATGGGCAGTTTTTCTCTGACGCACTGGA 
TTATCGTACTGATTATCGTCGTTTTGATATTCGGCACCAAAAAACTGCGCAACGTCGGCA 

AAGACCTCGGCGGTGCGGTTCATGACTTCAAACAGGGGCTGAACGAAGGTACAGACGGCA
AAGAAGCCCAAAAAGACGATGTAATCGAACACAAAAAAGACGAAGACAAAGCGTAATTTA 
TGTTTGATTTCGGTTTGGGCGAGCTGGTTTTTGTCGGCATTATCGCCCTGATTGTCCTCG 
GCCCCGAACGCCTGCCCGAGGCCGCCCGCACCGCCGGACGGCTCATCGGCAGGCTGCAAC 
GCTTTGTCGGCAGCGTCAAACAGGAATTTGACACTCAAATCGAACTGGAAGAACTGAGGA 

AGGCAAAGCAGGAATTTGAAGCTGCCGCCGCTCAGGTTCGAGACAGCCTCAAAGAAACCG
GTACGGATATGGAAGGCAATCTGCACGACATTTCCGACGGTCTGAAGCCTTGGGAAAAAC 
TGCCCGAACAGCGGACACCTGCCGATTTCGGTGTCGATGAAAACGGCAATCCGCTTCCCG 
ATGCGGCAAACACCCTATCAGACGGCATTTCCGACGTTATGCCGTCCGAACGTTCCTACG
CTTCCGCCGAAACCCTTGGGGACAGCGGGCAAACCGGCAGTACAGCCGAACCCGCGGAAA 
CCGACCAAGACCGCGCATGGCGGGAATACGTGACTGCTTCTGCCGCCGCACCCGTCGTAC 
AGACCGTCGAAGTCAGCTATATCGATACTGCTGTTGAAACGCCTGTTCCGCACACCACTT 
CCCTGCGCAAACAGGCAATAAGCCGCAAACGCGATTTTCGTCCGAAACACCGCGCCAAAC
CTAAATTGCGCGTCCGTAAATCATAAAGAGAGCAATCCGGTGTCCGAAACACAAAACGAA 
CAACCCGTCCAACCGCTTGTCGAGCATCTCATCGAGCTGCGCCGCCGCCTGATGTGGACG 
GTTGTCGGCATCTTAGTCTGCTTTTTCGGCCTAATGCCGTTTGCCCAACAACTCTATACT 
TTTATCGCCGACCCGCTGATGGCAAACCTGCCCAAAGACACCAGCATGATTGCCACCGAT 
GTCATCGCACCATTTTTCGTGCCGGTCAAAGTTACCCTGATGGCGGCATTTTTAATTTCG
CTGCCGCATACGCTCTACCAAATCTGGGCATTTGTCGCGCCCGCACTCTACCAAAACGAA 
AAACGCCTGATTACGCCGCTCGTCCTCTCCAGCGTCAGCCTGTTTTTCATCGGCATGGCA 
TTTGCCTACTTTTTGGTTTTCCCCGTCATTTTCAAATTCCTTGCCAGCGTTACCCCTGTC 
GGTGTCAATATGGCGACAGACATCGACAAATACCTCTCCTTCATCTTGGGGATGTTTGTT 
GCGTTCGGCACAACGTTTGAAGTCCCCATTGTCGTTATCCTGTTAACCAAAATTGGTGTG
GTAACAACCGGACAGCTCAAACGCGCCCGCCCCTATGTGATTGTCGGCGCGTTTGTCATT 
GCCGCCATCATCACGCCGCCCGATGTGATTTCACAAACCCTGCTTGCCATTCCGCTGATT 
CTCTTATACGAAGCAGGTATTTGGTTCGGACGCTTTTTCACGCCACGTTCAGAACAGGAT 
GGCGACATACGGCCGCCTGCAACAACCTGACACTATGCCGTCCGAACCTCCGCCTCATAC 

CGCAACAGATTAAGGAATACCTTTGAATACCCTCTATTTAGGTTCAAACAGCCCGCGCCG
AATGGAAATCCTGACACAGTTGGGCTACCGCGTCATCCAACTGCCTGCCGGCATCGACGA 
ATCCGTTAAAGCCGGCGAAACACCTTTCGCTTACGTTCAAAGGATGGCAGAAGAAAA7W5L 
CCGAACCGCCCTGACCCTCTTTTGCGAAACCAACGGCACAATGCCCGATTTCCCCCTGAT 
TACCGCCGACACCTGCGTCGTTTCAGACGGCATCATATTGGGCAAACCCCGCTCCCAAGC 

CGAAGCAATCGAATTTTTAAACCGATTGTCCGGCAAACAACATACCGTCCTGACTGCTGT
CTGCATTCATTATCGCGGCAAAACGTCAAGCCGCGTCCAAACCAACCGCGTCGTTTTCAA 
GCCCCTGAGTTCGGAAGAAATTTCCGCCTATGTGCAAAGCGGCGAGCCGATGGACAAAGC 
CGGTGCCTACGCCGTACAAGGCATAGGCGGCATCTTTATCCAATCTATCGAAGGCAGCTT 
CAGCGGCATTATGGGACTGCCCGTTTATGAAACCGTTTCGATGTTGCAGGATTTGGGATA 

CCGCTCCCCCTTGTCCGCCCTTAAACCGTAAAGACCGCCGTGAACAGACAAACCGCTTAC
CTCCTTGCCTCTTTCAGCCTGATCGCACTGATAATCCTGTCCCTTTCCTGGGAACTGTGG 
ATTGCACCGTTGCGCCCGGGCGGCTCGTGGCTCGCGCTCAAAGCCCTCCCCCTCTGTCTG 
CCGCTTTCAGGCATCTTGAAAAAGAAAATCTATACTTACCAATACAGCTCCATGCTGGTT 
CTGATTTACTTTGCCGAAGCCGTCATGCGCCTGTTCGACGCCTATCCCGCAGAACAGATT 

TGCGCCGCGCTTTCCGCAGTATTCAGCATCATCTTCTTCATATCCTGCCTGTCGTTCGTC
AAACAATACAAGGAAACAAACAATGCCCGCTGAAACGACCGTATCCGGCGCGCACCCCGC 
CGCCAAACTGCCGATTTACATCCTGCCCTGCTTCCTTTGGATAGGCATCGTCCCCTTTAC 
CTTCGCGCTCAAACTGAAACCGTCGCCCGACTTTTACCACGATGCCGCCGCCGCAGCCGG 
CCTGATTGTCCTGTTGTTCCTCACGGCAGGAAAAAAACTGTTTGATGTCAAAATCCCCGC 

CATCAGCTTCCTTCTGTTTGCAATGGCGGCGTTTTGGTATCTTCAGGCACGCCTGATGAA
CCTGATTTACCCCGGTATGAACGACATCGTCTCTTGGATTTTCATCTTGCTCGCCGTCAG 
CGCGTGGGCCTGCCGGAGCTTGGTCGCACACTTCGGACAAGAACGCATCGTGACCCTGTT 
TGCCTGGTCGCTGCTTATCGGCTCCCTGCTTCAATCCTGCATCGTCGTCATCCAGTTTGC 
CGGCTGGGAAGACACCCCTCTGTTTCAAAACATCATCGTTTACAGCGGGCAAGGCGTAAT 

CGGACACATCGGGCAGCGCAACAACCTCGGACACTACCTCATGTGGGGCATACTCGCCGC
CGCCTACCTCAACGGACAACGAAAAATCCCCGCCGCCCTCGGCGTAATCTGCCTGATTAT 
GCAGACCGCCGTTTTAGGTTTGGTCAACTCGCGCACCATCTTGACCTACATAGCCGCCAT 
CGCCCTCATCCTTCCCTTCTGGTATTTCCGTTCGGACAAATCCAACAGGCGGACGATGCT 
CGGCATAGCCGCAGCCGTATTCCTTACCGCGCTGTTCCAATTTTCCATGAACACCATTCT 

GGAAACCTTTACTGGCATCCGCTACGAAACTGCCGTCGAACGCGTCGCCAACGGCGGTTT
CACAGACTTGCCGCGCCAAATCGAATGGAATAAAGCCCTTGCCGCCTTCCAGTCCGCCCC 
GATATTCGGGCACGGCTGGAACAGTTTTGCCCAACAAACCTTCCTCATCAATGCCGAACA 
GCACAACATATACGACAACCTCCTCAGCAACTTGTTCACCCATTCCCACAACATCGTCCT 
CCAACTCCTTGCAGAGATGGGAATCAGCGGCACGCTTCTGGTTGCCGCAACCCTGCTGAC 

GGGCATTGCCGGGCTGCTTTiAACGCCCCCTGACCCCCGCATCGCTTTTCCTAATCTGCAC
GCTTGCCGTCAGTATGTGCCACAGTATGCTCGAATATCCTTTGTGGTATGTCTATTTCCT 
CATCCCTTTCGGACTGATGCTCTTCCTGTCCCCCGCAGAGGCTTCAGACGGCATCGCCTT 
CAflAAAAGCCGCCAATCTCGGCATACTGACCGCCTCCGCCGCCATATTCGCAGGATTGCT
GCACTTGGACTGGACATACACCCGGCTGGTTAACGCCTTTTCCCCCGCCACTGACGACAG 
TGCCAAAACCCTCAACCGGAAAATCAACGAGTTGCGCTATATTTCCGCAAACAGTCCGAT 
GCTGTCCTTTTATGCCGACTTCTCCCTCGTAAACTTCGCCCTGCCGGAATACCCCGAAAC 
CCAGACTTGGGCGGAAGAAGCAACCCTCAAATCACTAAAATACCGCCCCCACTCCGCCAC
CTACCGCATCGCCCTCTACCTGATGCGGCAAGGCAAAGTTGCAGAAGCAAAACAATGGAT 
GCGGGCGACACAGTCCTATTACCCgTACCTGATGCCCCGATACGCCGACGAAATCCGCAA 
ACTGCCCGTATGGGCGCCGCTGCTACCCGAACTGCTCAAAGACTGCAAAGCCTTCGCCGC 
CGCGCCCGGTCATCCGGAAGCAAAACCCTGCAAATGACCCGCGCCGGCGGATGCGGATAC 
CGCCCGAAATGTAAAGCTCCATGCAAGACATTGCAAAAAACAGCAAACCGGTAGGGAAAT
ACGCTATCAAAAAACATTGCAGCCGTGTTAAGATAAACCGTCAAACAATCTTTTCACaGC 
CCCGCCCGAAACAGGGTCGGGGCATACCCTTACGAAAAGGAAACACCATGAGCCGCGTAT 
TACTCGTAGATGACGATGCCCTGCTAACCGAACTGCTGACCGAATACCTGAGCGCCGAAG 
GTCTGAACGTCCGCAGCGTTCCCGACGGGGAAGCAGGCGTACAGGAAATCCTGAGCGGGC 
AGTACGATGTAGTCGTATTGGATTCCATGATGCCCAAAATGAACGGCTTGGATGTCTTGA
AAAACGTACGCGCCCGAAGCACCGTCCCCATCATCATGCTGACCGCCAAAGGCGACGACA 
TCGACCGAATCATCGGCTTGGAAATGGGCGCGGACGACTATGTCCCCAAACCCTGCACAC 
CACGCGAACTCTTGGCACGCATCAATGCCATCCTGCGCCGCGCACAACACAGCGGCGAAC 
AGAACAACGCACCCAACAGCATCTCCGTCAGCGATGTCGTCCTGTACCCCGCCAAACGCC 
AGGCATCCGTCAAAGACATGCCGCTCGAACTGACCAGCACCGAATTCAACCTGCTCGAAG
TCCTGATGCGCCATGCCGGACAGGTAGTCAGCAAAGAAACCCTGTCCGTCGAAGCACTCG 
ACCGCAAGCTGGCAAAATTCGACCGCAGTATCGACGTACACATCTCCAGCATCCGCCACA 
AGTTGGGCGATGCCTCTCTGATTCAAACCGTACGCGGCTTGGGCTACCTGTTTGTCAAAA 
ACTGAAATAAACAGATAAATGAAACTGTTCCAACGCATTTTCGCCACATTTTGCGCGGTT 
ATCGTCTGTGCAATCTTTGTGGCGAGTTTTTCTTTCTGGCTGGTGCAGAACACCCTTGCC
GAAAACCAGTTCAACCAACGCCGCACCATCGAAACCACTTTGATGGGCAGCATCATTTCC 
GCATTCCGGGCACGCGGGGACGCGGGTGCGCGCGAAATCCTGACGGAATGGAAAGACAGC 
CCCGTCTCATCGGGCGTGTACGTTATACAGGGCGACGAGAAAAAAGATATCCTGAACCGG 
TATATCGACAGCTATACCATCGAACGCGCCCGGCTTTTCGCCGCCGGACACCCGCATTCC 
AACCTCGTCCATATCGAATACGACCGCTTCGGCGAAGAATACCTGTTCTTCACCAAAGAC
TGGGACAAACTCCAAGCCCGCCGCCTGCCCAGCCCCCTGTTGATCCCCGGCCTGCCGCTC 
GCCCCGATTTGGCACGAACTCATCATATTGTCCTTCATCATCATCGTCGGACTGCTGATG 
GCATATATCCTCGCCGGCAACATTGCCAAACCCATCAG7u\TCTTAGGCAACGGCATGGAC 
AGGGTGGCAAACGGAGAACTTGAAACCCGTATCTCCCAACAGGTCGACGACCGCGACGAC 
GAATTGTCCCATCTTGCCATCCAATTCGACAAAATGGTGGAAAAACTCGAAAAACTCGTT
GCCAAAGAACGCCACCTGCTCCATCACGTCTCCCATGAAATGCGTTCTCCCCTTGCGCGC 
ATGCAGGCAATTGTCGGACTGATTCAGGCGCAGCCCCAAAAACAGGAGCAATATCTCAAA 
CGGCTGGAAGGCGAACTGACCCGCATGGATACGCTGGCCGGGGAACTGTTAACCCTGTCC 
CGTCTCGAAACTTCCAATATGGCTTTGGAAAAAGAAAGCCTGAAACTCCTGCCCTTCCTG 
GGCAACCTGGTAGAAGACAATCAAAGCATTGCCCAGAAAAACGGACAAACGGTTACCCTG
TCTGCCGACGGAAAAATCCCCGAAAACACAACCATCCTTGCCAACGAAAGCTACCTGTAC 
CGCGCCTTCGACAACGTCATCCGCAACGCCGTCAACTACAGTCCCGAAGGCAGCACCATC 
CTGATCAACATCGGACAAGACCACAAACACTGGATAATCGACGTTACCGACAACGGCCCC 
GGCGTGGACGAAATGCAGCTCCCGCACATCTTCACCGCTTTCTACCGTGCAGACTCCAGT 
GCCAACAAACCCGGAACAGGACTGGGGCTTGCATTGACCCAACATATTATTGAACAGCAC
TGCGGCAAAATCATCGCCGAAAACATCAAACCGAACGGTCTGCGGATGCGCTTTATCCTG 
CCCAAGAAAAAAACCGGTTCCAAAACAGAAAAAAGTGCGAACTGACCATAATACCGTCTG 
AACCGGCTTCAGACGGCATTGCACAAACAGTTACCCCAAAACAAACACACAGCGGGTGTC 
TCTCCAAAAATCACGCACATACCGTCCGCAAAGGAATATATCATGTCGGCACAAACCGAT 
CCGGGCTACTTCTTCATGCCCAACCACATCATCCTGATAGGCGCGAGCGAACAACCGTAC
AGCCTGGGCGAACGTGTACTCAGCAACCTGCTGAGTACGCCCTTTCAAGGAAAAATCACC 
CCCGTAAACCCGCGCCACCACACCATAGCCGGACTGCCCGCCTACACCAGCCTCAACAAA 
ATCCCCGGCAATGCAGACCTGATTATTGCCGTTACCCTACCCGACAGTTACGACACCCTC 
TTCAAAACCTGCCGCAAAAAGCAGCTCCGACACATCATCCTCATACAGGACTGGGACAAC 
CTGTCTGCCGCAGAACTGCACACCGCCGAAACTGCCATCCGCAAACACCACGGCAACGGA
CTCAACATCACCGCCTGCACCACCGCAGGCATCCAACTGCCCTCACTCGGACTCAACATC 
AGTACCCAAGACGGATACGCCGCAGGCCATACCGCCATACTGACCGGCAATGCCGCCGTC 
AGCCGCCAAATCGACAACATCCTGAACAAACTCCGTCAAGGCACATCCCGCCACATCAGC
CTGCATCCCGGCATCAGCCCCATCACATCCGCCGATTGGCTCAACCGCTTCGGACACAGC 
CTGCACACCAAAACCGCCGTCCTACACCACAACCCTGAAGAGGATCAGCGCAAACTGTTC
AGCGCAATCCGCCAATTTACCCGCCATACGCCGCTGATTCTCCACATCACCTGCCGCACG 
ACAGAAACCGACCGTGCCGTACTGCACTGCCTCGCCCGACACTGCAACTTCCTCGTCAGT
TTCAACGCCGACGACCTCGAAGCCGCACTGCGCGCCCAACTGTCCGACCTTCCACCCCTG 
TCCCGACTCGACATCCTGTCCGACACGCCTGCCGAATGGCTGCACGCGCACGCGCCAAAA 
AACCTCACCCTCCACTTTCCCAACCTTCCCCACCACATCCGCAACGGACACCTGACCGGC 
ACACCCACACCCTCAATCTGCCACGACATCGCCTCACGTCAGCTTGCCCACCCCGACACC 
CAAGCCGTCCTAACCATCCTCAGTCCCTCCGGACACGAGGATTACAAAAAAACAGCACGC
GCCCTTATCCGCCTGTCCGAACAGACCGCCAAACCCCTGCTCGTCAGCAGCCCCTTTTCA 
GACGGCATAACCCATTTCGACACCCCCACTCAGGCAATCCGCACCCTTTCCTACCGCAAC 
ACCGCCGCCGCCCTGAAACAGGCACAGCTCGACATTGCACCGCCGCAGCCATGCCGTCTG 
AAAACACCGCAACCCCAAAACATCAAAAAAGCCCTTGCAGCGGCAAACCCCTCCCTGCTC 
GCCGAAGCCCTGCACCTCCCCCCCTACCGGCACACCACCCATAACGCCGTACAATTCCAA
TTCGACAGCCACCCCCTCTATGGCGACATCCTGACCGCACGCTGCAACGGACAAACCACT 
GCCGTACTCCCGCCGTTTACCACGCTCGACAGCCGCCACCTTGCCCGCTTTGCCGAACTC 
GACGGCACACAAACCCTCGACCAGTTCCTGCACACACTGACCGTCATTCCCGAATACCGC 
CAACACATTCTCGGCATCACCCTCAACCTCAACGGCGGACAATACAGCAGCGATTTCCTC 
TTAAAGACACCCGAAACACACGACACGCCCAAACGCAAGAACACAGGCAAAGCCGCCCAA
ACCCTCGAACATGCCGCCGCAAAAATGCAGAGTGCCGCCGCATACCTGAAACACAAAAAC 
CCGACAGCCGCCGAATTTCTCCGCCACACAAGCGAAGCCGCCGCAGAACTGCTCGGCAGC 
AAAACCGAAACCGGAGCAGCCGTACCCAACGTACTTGCCCCCTATCCCGCAGCATACCCC 
AAAACACTGTCCCTAAGAAACAACACGACCGTTACCATTACCCCCATTTTGCCCGAAGAC 
GCAGAAGCCAAACAGCAGTTCGTCCGCAGCCTCGGTCCCGAAGCACGGTACACACGCTTC
ATGACCCACACCAACGAACTGCCCGCAGCCACGTTGGCACGCCTGTGCAACCCCGATTAC 
CACTGTGAAGCCGCATGGACGGCAAAGGATGCCGACAGCAACATCGTCGCCGTCGTCCGC 
CACAGCCGCCTGAATCGCAACGAATGCGAATTCGGCATCACACTGGCGGAACATATGCGC 
GGCAGCGGACTGGCACAGAAAATGATGGAACTCATCATCCAAACCGCCGCACAGCAAGGC 
TACCGGACTATGAGTGCCGACATTCTCAAAACCAATACCCCCATGATCAAACTTGCTGAA
AAATCAGGATTTACCCTCAAGGAATCGGACACCGAAAAAAACCTGTACCGCGCATATCTG 
AACCTTGCGGCAGACAAAACAACAGAAAAAACAAATAAAAACTTGCGCACCGACCACAAA 
ATAACCTAAAATCGGCAGTTTCCATATATCCGATTTTCTCAAAAGGACTCAAAATGGTAG 
TTATCCGTTTGGCACGCGGCGGCTCGAAACACCGCCCCTTCTACAACGTCATCGTTACTG 
ACTCACGCAGCCGCCGCGACGGCCGCTTCATCGAACGCGTAGGCTTCTACAACCCCGTAG
CCAACGAAAAACAAGAGCGCGTCCGCCTCAATGCAGACCGCCTGAACCACTGGATTGCAC 
AAGGCGCGCAAGTCAGCGACTCCGTTGCAAAACTGATTAAAGAACAAAAAGCCGCCTAAT 
CCGCATTTGCCGCCATGACAGACACTCAAAACCGGGTAGCCATGGGCTACATCAAAGGCG 
TATTCGGCATAAAAGGCTGGTTGAAAATTGCCGCCAACACCGAATATTCCGACAGCCTTT 
TGGACTACCCCGAGTGGCATTTGGTCAAGGACGGCAAAACCATCAGCGTTACCCTTGAAG
CCGGAAAAGTCGTCAACGGCGAACTCCAAGTCAAATTCGAAGGCATAAACGACCGCGACT 
TGGCATTCTCATTGCGCGGTTACACCATCGAAATACCCCGTGAAGCATTCGCCCCGACAG 
AAGAAGACGAATACTACTGGACAGACTTGGTCGGCATGACCGTTGTCAACAAAGACCATA 
CCGTTTTAGGCAAGGTAAGCAACCTGATGGAAACCGGCGCAAACGACGTATTGATGATTG 
ACGGAGAACACGGGCAGATTCTGATTCCGTTCGTTTCCCAATATATCGAAACCGTCGATA
CCGGCAGCAAGACCATTACTGCCGACTGGGGTTTGGACTACTGATGCTTATCCAGGCAGT 
TACCATTTTCCCCGAAATGTTCGACAGCATTACCCGCTACGGCGTAACGGGACGCGCGAA 
CAGACAGGGAATCTGGCAGTTTGAAGCAGTCAATCCCCGAAAGTTTGCCGACAACAGATT 
GGGCTACATCGACGACCGCCCGTTCGGCGGCGGCCCGGGAATGATTATGATGGCTCCGCC 
GCTTCATGCGGCAATAGAACACGCCAAAACACAATCCTCCCAAGCTGCAAAAGTCATCTA
CCTCAGCCCCCAAGGGAAACCGCTTGACACACCAAAAAGCGGTAGAACTGGCAGAACTTC 
CGCATCTGATTCTGCTGTGCGGACGGTATGAGGGCATAGACGAAAGGCTTCTGCAAAGCA 
GCGTCGATGAAGAAATCAGCATCGGAGACTTCGTTGTTTCCGGCGGAGAGCTTCCCGCCA 
TGATGCTGATGGATGCGGTATTGAGGCTCGTACCCGGCGTATTGGGCGATATGCAGTCTG 
CCGAACAGGATTCGTTCTCAAGCGGCATTTTGGACTGCCCCCACTACACCAAACCCTTAG
AATTTCAAGGTATGGCTGTTCCGGAAGTATTGCGTTCCGGCAATCATGGCTTGATAGCGG 
AATGGCGGTTGGAACAATCGCTGCGCCGCACCTTGGAGCGCAGACCCGATCTTTTGGAAA 
AGCGCCTTTTAPlTCCCAAAGGAATCCCGCCTCTTAGAAACCATCCGGCAAGAGCAACGGG
AAATCCAATCATAATTTAGGAAAAAAACAATGAACCTGATTCAACAGCTCGAGCAAGAAG 
AAATTGCCCGCCTGAATAAAGAAATCCCCGAATTCGCACCGGGCGACACCGTAGTCGTAT 
CCGTACGCGTCGTGGAAGGTACCCGCAGCCGTCTGCAAGCCTACGAAGGCGTGGTTATTG 
CCCGTCGCAACCGTGGTCTGAACAGCAACTTCATCGTCCGCAAAATCTCCAGCGGCGAAG
GTGTTGAACGTACTTTCCAACTGTACTCTCCGACCGTCGAAAAAATCGAAGTCAAACGCC 
GTGGCGACGTACGCCGTGCCAAACTGTACTACCTGCGCGGCCTGACCGGTAAAGCTGCAC 
GCATCAAAGAAAAACTGCCTGCACGCAAAGGTTGATTCAAACCGTTTTCCCCCAATGCCG 
TCTGAACCTTCAGACGGCATTTCTTATTGCTGCGATGCCGGCAAGTCAACGATATGCCTG 
CCCCTCATCCACCTTGATTAACAATCACTCCCGCCTATTCAAGCATAATTTACAGACCAA
ACGTTATACAGTATCATATCAGCCATATAACGATAAACGCTTTACCGCCTTCCCTTTCAA 
AACCGCAAATCATCATGACACCTTCCCTTTTACTATCAGGATTGACCTTCCGCCTCATCC 
TTGCCCTGATTGCCGTATCCCTTTTATGGGGCGTTTACTTTTGGGCAGTATCCGCATGAG 
CATCATTGTCGAAAACCTGACCGTCAGcTACCGCCGCCGACCTGCCGTGCACCATGTGGA 
CATTACTTTTGAAGAACATAGTATGTGGGCGGTTTTCGGTCCCAACGGCGCAGGGAAATC
CACCTTTCTCAAATCCTTGATGGGATTGCAGCCTATCGATACAGGCAGCATCCGGCTGGA 
CGGATTGACCCGTCAGAACATCGCCTACCTTCCCCAGCAGTCCGATATCGACCGCTCCCA 
GCCTATGACCGTTTTCGACTTGGCGGCAATGGGGCTATGGTATGAAATCGGCTTTTTCAA 
AGGGATAAATACCGCTCAAAAACAACGCGTTCACGTAGCCTTGGAGCGCGTCGGAATGCA

ACGGTTTGCCGACCGTCAGATTGCCTATCTCTCAAACGGACAATTTCAGCGTGTCCTTTT
TGCCCGAATGCTGGTTCAAAATGCCAAATTCCTGCTGCTCGACGAACCCTTCAATGCCGT 
TGATGCACGGACAACCTACGAGCTTCTCGACGTATTGCAGAAATGCCATTGCGGCGGACA 
CGCCATCATCGCCGTACTGCACGATTACGAACAAGTCCGTGCCTACTTTCCCAATACCCT 
GCTGCTCGCCCGCGAAAAAATTGCGGCAGGCGCAACCGAGACCATTCTGACAGAACCCTA 

CCTCGCCCAAGCCAACGCCAAAATGCAGCAACAGGAAAGCCCCGACTGGTGCGCCTCATA
AATGCCGTCTGAAACCGAAAAaCCATGAATCTCTACGACCTGCTCCTTGCCCCCTTTGCA 
GAATTCGACTTTATGCGCTACGCCCTCGCATCCGTCTTCTGCCTGTCCCTCAGTGCCGCA 
CCCGTCGGCGTATTCCTCGTCATGCGCCGTATGAGCCTGATAGGCGACGCATTGAGCCAC 
GCCGTCCTGCCCGGTGCCGCCGTCGGCTACATGTTTGCCGGCTTGAGCCTGCCCGCCATG 

GGTTTGGGCGGCGTAGCCGCAGGCATGCTGATGGCACTGCTTGCCGGACTCGTCAGCCGC
TTCACCACCCTGAAAGAAGATGCCAACTTTGCCGCCTTTTATCTCAGCAGCCTCGCCATC 
GGCGTAGTCCTCGTCAGCAAAAACGGGAGCAGCGTCGATTTGCTCCACCTCCTTTTCGGC 
TCTGTACTTGCCGTCGATATTCCTGCCCTGCAGCTCATCGCCGCCGTCTCCAGCCTCACG 
CTCATTACCCTTGCCGTCATCTACCGCCCGCTCGTACTCGAAAGCATCGACCCCCTGTTT 

CTCAAATCCGTCGGCGGCAAAGGCGGGCTTTGGCACGTCCTCTTTCTCGTCCTGGTCGTC
ATGAACCTCGTATCCGGCTTTCAAGCCCTCGGCACACTCATGTCCGTCGGACTCATGATG 
CTGCCAGCCATTACCGCCCGCCTGTGGGCGAAGCATATGGGCGCACTCATCCTCCTATCC 
GTTCTGACAGCCCTGCTGTGCGGCTTGAGCGGACTGCTCATTTCCTACCACATCGAAATT 
CCTTCCGGTCCCGCCATCATCCTCTGTTGCAGCGTCCTTTATCTCTTTTCCGTCATACTC 

GGCAAAGAAGGCGGCATTCTGACCGG



The following partial DNA sequence was identified in N. meningitidis <SEQ ID 6>: 
gnm_6 

GTTATGATATGTTTCGATATGTAAACATTTATAGGTTTGGAGCGATAAACAATGAATGCA 
GTCGTAGTTGCCGTAATCGTTATGCTGGTGCTGTCGCTGTCGCGCGTGCACGTGGTATTG
AGCCTGACGGTCGGCGCGTTTGTCGGCGGCGCGGTGGCGGGTATGCCGCTGCAAAACATT 
GCCGATGCGGCGGGACAGGTCAGTCAGGCGGGGATTATCCCCGTGTTCAACAAAGGTTTG 
GAAGGCGGTGCGAAGATTGCGCTTTCTTATGCGATGCTCGGCGCGTTTGCAATGGCGATT 
ACCCATTCCGGCCTGCCGCAGCAGCTTGCCGGCGCGGTCGTCCGCAAGCTGAACCGGGGC 
GGTATGCCCGACAGCGTGCGTTCGGGCGAGGGCGCGGTCAAATGGCTGCTGCTTTCCATC
ATCCTTGTGATGGGCATGATGAGTCAGAACATCATCCCCATCCACATTGCCTTTATCCCG 
ATGATTGTTCCGCCGCTGCTTTTGGTGTTCAACCGCCTGAAAATCGACCGCCGCCTGATT 
GCGTGCGTCATCACTTTCGGGCTGGTTACGACTTATATGTTCCTGCCTTACGGCTTCGGC 



GCGATTTTTTTGAACGAAATCCTGTTGGGCAACATCCATTCCGCCGCGCCGCAGCTTGAT 
GTGAAAAACATTAACGTGATGGCGGCAATGGCGATTCCCGCGTTGGGAATGCTGGCCGGA 
CTCCTGCTGGCGTTTGTCCATTACCGCAAACCGCGCCTGTACCAAAGCAACAATGCCGAT 
ACGGCGGGCAACGCCGATGCGGCAAACCGTCCGCAGCCGTCCGCCTACCGCAGCCTGGCC 
GCCGCCGTCGCCATTGCCGTATGCTTTGCCATCCAGTTGATGTATGAAGACTCGCTGGTG
TTGGGCGCGATGCTCGGTTTCGCCGTATTTATGATGTTGGGGGTCATTAACCGCGACAAG 
GCAAACGACGTATTCGGCGAAGGTATCAAGATGATGGCGATGGTCGGCTTCATTATGATT 
GCCGCGCAGGGTTTTGCCGCCGTGATGAATGCGACCGGGCATATTCAGCCGCTGGTGGAA 
AGCAGTATGGCGATATTCGGCAACAGCAAAGGTATGGCGGCATTGGCGATGCTGGTGGTG 
GGGCTTTTGGTAACGATGGGCATCGGTTCGTCCTTTTCCACTTTGCCGATTATTGCCGCG
ATTTATGTGCCTTTGTGTGTCGGTTTGGGTTTTTCGCCGCTTGCCACCGTCGCCATTGTC 
GGCACGGCGGGGGCGTTGGGCGATGCCGGTTCGCCTGCGTCCGATTCCACGCTGGGCCCG 
ACGATGGGGCTGAACGCCGACGGGCAGCACGACCACATCCGCGATTCCGTTATCCCGACC 
TTCATCCACTACAACATCCCGCTGCTGATTGCCGGCTGGATTGCCGCGATGGTGCTGTAA 
ATGGACGCGGTTCAAGAGTTGGAACGCCGTATTGTCGAACTGGAAATCCAATCCGCGCTT
CAGGAGGACGTAATCGCCGGCCTGAACGCGATGGTGGCGGAATTGCGGCAGACGCTGGAT 
TTGCAGCAGGCTCAGTTGAGGCTGCTGTATCAAAAAATGCAGGACAGGAATCCCGACGCG 
CAAGAGCCGTATTCCCTGCGCGACGAGATTCCGCCGCATTATTGATGCGCCGCCGTATCC 
GGATTTCCTTAAAAAAGGCTGTGTTTGAATATTCCGCCGATGCAAACCTAAGATATATAG 
TGGATTAACAAAAATCAGGACAAGGCGACGAAGCCGCAGACAGTACAAATAGTACGGAAC
CGATTCACTTGGTGCTTCAGCACCTTAGAGAATCGTTCTCTTTGAGCTAAGGCGAGCCAA 
CGCTGTACTGGTTTTTGTTAATCCACTATAACTTTAATCAGTCGGACAAAATGCCTTTTA 
TCCGATGGAACCGTTTTCCGCCCGAACGGAAATAAGCGGCTTCCCCCGCTGCAAACAGAT 
AAAACCAACCGGGTATTCAAACACAGCCAAAAAAAAGCCGTCCGAACCCAAAGGGGACAG
ACGGCCAAACACATTACGGGGAAAACGTTTTACTCAATGAGTCTGCGAAACAGACATTGC
TAAAACAATTGCAATTATAAGGTAAGTTTATTTTTTTTTTTTTTTTTTTGCAAAGAAATG 
TAAATTTTTAAATTTTTAAAACCGTTCCGCCGCATCCGTTTGGATGCCGTCTGAAACGGC 
GGCAGGGTTTATCGCGCTGTGCCGCAAACCGGGCATTCAGGGTTGCGCGGCAGGTCGAAA 
TATTGCCAGCCCCCTTCCAAGGCACGGTAAACCGCCAGCCTGCCGTGCGACGGTTCGCCC 
GCATCCAGCAGGATTTTCAGAGCCTCCGCCGCTTGGGTACTGCCGATGATGCCGACCAGC
GGCGAGAACACGCCGAAGAGAGAACAGATGCCGTCTGAAGCCGATCCGCCGTCAAACAGG 
CAGGCGTAACACGGCGAGTCGGGCAAGTCGGGACGGTACACGGCAAGTTGCCCTTCAAAG 
CGTACCGCCGCCCCTGAAACCAGCGGTGTTTTCGTTTGCACGCAGGCACGGTTGACGGCT 
TGCCGCGTGGCGTAGTTGTCGCAACAGTCTAAAACGATGTCGGCGGCTTGAACCAAACCG 
GTCAGGCGGCAGCCGTCGAGTTTTTCGTTGACGGCGCGGACGTTGACGGTATGGTTGATG
CGTTTCAGGCGGCCTGCCAAGGCTTCGGTTTTGAGTTTGCCGACATCGCCCTCGTCAAAT 
GCGACTTGGCGTTGCAGGTTGTGCAGTTCGACCGTGTCGGAATCGGCTATGGTCAGCGTG 
CCGACACCCGAAGCGGCAAGGTAGGGCAGTGCGGCGGCACCCAAACCGCCGCAGCCGACG 
ACCAAT^TATGCGCGGCGGAAAGTTTCTGCTGCCCTTCGATGCCGATTTCGTCCAAGAGG 
ATGTGGCGGCTGTACCGCAGCAGGAATGCATCGTCGTTGTCGTGTTCGGTCGTGGTCATG
ATGATGTTCGGAAAAAAACAGTTGCGGGCGATTGTAACGCTGCCGTCGGGCGGCGTTCAA 
CTTCAGACGGCATTTCGGGACACGGGCGGTTAAAGTGTGAACGGTTTGGCACGGATGCGG 
CATTTGGGGTACATTTACAATATTTACGGCAGACGAGAGAGAAAAATCATGCAACTGCAT 
ATTCTGAACAATCCAAAGGACGCGGCTTTGGCGGCGGACGCGGAATTTCTGAAACAATCC 
CTGTTCAACCTCCTGCACGAAGAAGCCTCGCCGTTGGTTGTCGAAACAGTCAAACTCTTG
TCCACTTCCGACGACAGCGCGGCATTGATTGAAAAAGTATTGCCGCAATTGGACGAACAA 
CAAACCCACGATTTAACCTTGGCCTGCGGCCTGTTCGCCCAGATTTTGAACATCGCCGAA 
GACGTGCACCACGAACGCCGCCGCCAAATCCACGAAGAAGCCGGACGCGGCGGCGCGGAA 
GGCAGCCTGACGGAAACCGTCCGCAGGCTCAAAGCGGGGAAAGCCGACGGCAAATCGGTG 
CAGCGGCAGTTGGACAATACGTCCGTTACCGCCGTTTTGACCGCGCACCCGACCGAAGTG
CAACGCCAAACCGTCTTAAGCTTCAACCGCCGCATCCGCGCACTGTTGCCGCAACGCGAA 
CGCTGCACCAATGCCGACGCGCTGGCACGGCTGCGCCGCGAAATCGACACTATCCTGCTG 
GGCTTGTGGCAGACCAGCGAAACGCGCCGCCACAAACTCAGCGTCAACGACGAAATCAAC 
AACGGCGTGTCCATCTTCCCGATGAGCTTTTTCGAAGCCCTGCCCAAGCTCTACCGCAAG 
ATGGAACACGACTTTCAGACGGCCTATCCCGGCGTCCGCGTTCCGGACATCCTCAAAATC
GGCGGCTGGATCGGCGGCGACCGCGACGGCAATCCGTTTGTTTCTGCCGAAACCCTGCGC 
TTTGCCTTCCGCCGCCACGCCGATGCCGTGTTCCGCTTCTATCGCGGCGAACTCGACAAA 
CTCTACCGCGAACTGCCGCTCTCCATCCGCCGCGTCAAAGTCAACGGCGATGTAACGGCG
TTGTCCGACAAATCGCCCGACGAAGAAATCGCCCGCGCCGAAGAACCCTACCGCCGCGCC 
ATCGCCTACATTATGGCGCGCGCTATGGGCAAAGCGCGCGCGCTCGGTTTGGGTATGGGC 
TGCAAATTCGGCTTTCTCGAGCCTTATGCTTCGGCACAAGAGTTTCTGGATGATTTGAAA 
AAATTGCAACGTTCCCTTATCGACAACGGCAGCCGTCTGCTTGCCGAAGGCCGTTTGGCA
GACCTCATCCGTTCCGTATCCGTGTTCGGCTTTCACATGATGCCGCTCGACTTGCGCCAA 
CACGCAGGCAAACACGCCGATGTGGTTGCCGAGCTTTTCCAACACGCAGGCTTGGAAGAC 
TACAACCGCCTGAACGAAGAGCTmAAACAAACCGCCCTGTTGCGCGAATTGAGCCATCAA 
CGTCCTCTGTACAGCCCGTTTATCACATACAGCGACCATACCCGCCACGAACTGGCAATT 
TTCAACGAAGCGCGCAAAATCAAAGACGAATTTGGCGAAGATGCCGTAACACAAAGCATT
ATTTCCAACTGCGAACAACCCAGCGACCTGCTCGCCTTGGCATTGCTGCTGAAAGAAACC 
GGCCTGTTGGCGGTGGAAAACGGCAAACCGCACAGCCGCATCAATATCGTGCCGCTGTTT 
GAAACCATTGAAGCGTTGGAAAACGCCTGTCCGGTCATGGAAACCATGTTCCGCCTCGAC 
TGGTACGATGCACTGCTCGAAAGCCGTGGAAACATCCAAGAAATCATGCTCGGCTATTCC 
GACTCCAACAAGGACGGCGGCTACGTTACCAGCTCATGGTGCCTCTATCAGGCGGAATTG
GGCTTGGTCGAACTCTTCAAAAAATACGATGTCCGTATGCGCCTGTTCCACGGACGCGGC 
GGCAGCGTAGGTCGCGGCGGCGGCCCTTCTTACCAAGCCATTCTCGCCCAACCGGCGGGC 
AGCGTGGCGGGACAAATCCGCATCACCGAACAAGGCGAAGTCATTACCGCCAAATACGCC 
GACCCCGGCAATGCCCAACGCAACTTGGAAACCTTGGTTGCCGCGACTTTGGAAGCCAGC 

ATCCTGCCGGATAAAAAAGACCCTGATGCCAAACTGATGCAGGCATTGTCGGACGTATCG
TTCAAATACTACCGCGAACTGATTACCCATCCCGACTTCATCGACTACTTTCTGCAAACC 
AGCCCGATTCAGGAAATCGCCACCCTCAACCTAGGCAGCCGTCCCGCCAGCCGCAAAACC 
TTGGCGCGGATTCAGGACTTGCGCGCGATTCCGTGGGTATTTTCCTGGATGCAGAACCGC 
CTCATGCTGCCGGCTTGGTACGGTTTCGGCAGCGCGGTGGAAACCTTGTGCGAAGACAAA 

CCCGAAACGCTCGCCGCCCTGCGCGAACACGCCCAAAGCAACCCGTTCTTCCAAGCCATG
CTCTCCAATATGGAACAAGTGATGGCGAAAACCGACATCACCCTCGCGGAAAACTATGCC 
GGCTTGAGCGAATCGCCCGATAAGGCAAAAATCATCTTCGGGATGATTAAGGAAGAATAC 
CGCCGCAGCCGCAAAGCACTGCTCGACCTACTGCAAACCGAAGAGCTTTTGCGCGACAAC 
CGCAGCCTCGCCCGTTCGCTCGCTTTGAGGATTCCCTACCTGAACGCGCTCAACGGTTTG 

CAAGTCGCCATGCTCAAACGCCTGCGTAAAGAACCCGACAATCCGCACGCCCTTCTGATG
GTGCACCTGACCATCAACGGCGTGGCGCAAGGTTTGCGCAATACAGGCTGATAGTGCCGC 
ATCGGGGCAAAATGCCGTCTGAACGCCTTTCAGACGGCATTTCCCTGACCGCACTTGCAG 
AGAAACACCGATTGTTTTAAAGTGAACGGCAGTGATATGTTGAAAGACGACCAATGAAAA 
TTACCGTTATCGGCGCAGGTTCGTGGGGTACGGCGCTCGCCCTGCATTTTTCCCAACACG 

GCAACCGCGTATCCCTGTGGACGCGCAACGCAGACCAAGTCCGTCAAATGCAGGAAGCGC
GTGAAAACA/^CGCGGACTGCCCGGCTTTTCCTTTCCCGAAACCTTGGAAGTGTGTGCGG 
ATTTGGCAGACGCGCTCAAAGACAGCGGACTTGTCCTTATCGTAACCTCCGTTGCCGGAT 
TGAGAAGCAGCGCAGAGCTGCTCAAACAGTACGGCGCGGGACACCTCCCCGTCCTCGCCG 
CCTGCAAAGGATTCGAGCAGGATACCGGGCTGCTGACCTTTCAAGTCTTGAAAGAAGTAT 

TGCCCGACAATAAGAAAATCGGCGTACTTTCCGGCCCGAGTTTTGCACAGGAACTCGCCA
AACAACTGCCCTGCGCCGTCGTCCTTGCCTCCGAAAACCAAGAGTGGATTGAAGAACTCG 
TACCGCAGCTCAACACGACCGTCATGAGGCTTTACGGCAGTACCGATGTTATCGGCGTGG 
CGGTTGGCGGCGCGGTAAAAAATGTTATGGCGATTGCCACCGGATTGTCCGACGGCCTAG 
AGTACGGGCTTAACGCCCGTGCCGCACTGGTTACGCGCGGATTAGCTGAAATCACCCGCC

TTGCCTCCGCAATGGGCGCACAGCCCAAAACCATGATGGGGCTGGCAGGCATCGGCGACC
TCATCCTCACCTGCACCGGCGCACTTTCGCGCAACCGCCGCGTCGGCTTGGGTTTGGCAG 
AAGGCAAGGAACTGCATCAGGTGCTGGTCGAAATCGGACACGTTTCCGAAGGGGTCAGCA 
CGATAGAAGAAGTCTTCAATACTGCCTGTAAGTACCAAATCGACATGCCGATTACCCAAA 
CTCTGCTGCAACTCATCCGCAAAGAAATGACCCCGCAACAGGTTGTCGAAAGACTGATGG 

AACGCAGCGCGCGTTTTGAATAAACAACAGACAGATGCCGTCTGAAGCCTTCAGACGGCA
TACGGACAGGTAAGGTTATGAAACAAAATATCGAAAAACTCGAAAGCAGCGTTTATACGT 
TGGTACAAAAATTCGAAACCCTCGTCAGCGAAAACCGCCGCCTCAAAGAAACCGTCGCCG 
AACTCAAACGGGCGCACGAGCGGCAAAAACTCGAACACGAAACCGCCGTCGACGAACTCA 
GCGAAGCCCTGCTCGTCCAAGTCGGCAAACTCAAAGAAGACCTGCAAAACAAAATTGACA 

GCCTGACAGAAGAAAATACACGATACCGCAGCCTGCTCGAACAGAGCAGGGAAAAAATCA
GCGCACTGGCAGCGCGCCTCCCCCAATGGCAGGAAACGCAGCAATAAGGATTAAAGGATG 
AACATCGAACAAGTCCACATCGAAGTCATGCACGCCCGGCTGACCGTCAACACGCCGGCA 
GAAGAAAAAGACACACTGTTGCAGGCAGTCGGAATGCTCAACGGCAARGCCGAAGCCATC
CGCGAAGGCGGACGCGTCGCGGACAGCGAAAAAATCGTCATTATGGCCGCGCTCAACGTC 
GTCCACGACCTGTTGAAAACCTCCCTGAACGGCGGCGATTTGGCAATCGGCGATTTTGCG 
CGTAAAATAGCCGATATGGACAACGCCTGCCAAAAAGCACTATCCCGCTTGGCGCAGGAA 
TAACCCCTTTTTCCCTGCGGTGTCCGCGAAGGCATATATTCCTTTGAACCAATAAGTTCG
CTTAAGGTTGCCGGGAGCAGTAGTGGGCGCGAGCGTCCTTTTGCGGACGCACCCGAAACT 
ACCCGAGGCAGCCGCCTTGTCAGCAAGGTTCAAGCGGATTCGGCCGAAACGGCTCTCGCG 
GGGATTCCCATTCAAAACCGCATCGCAACGATGCGGTTTTTCACATTGGCCGCCGCCCCG 
CCCGAAAAAGCAGTATGCAGTCTGCCGAAATCCGGAAAATCCTTCCGTAACGCGGTAATG 
CGATTGACAAAATCGAATCCTGTCTTTAAAATTTACAACTTTCTATATCTATCTGGATTT
CCACTATGAAAACCTTTTCAGCGAAACCCCACGAGGTGAAGCGCGAATGGTTCGTCATCG 
ATGCCCAAGACAAAGTCTTGGGTCGCGTTGCGGCCGAAGTCGCCAGCCGTCTGCGTGGCA 
AACACAAACCTGAATACACCCCCCACGTCGATACCGGCGATTACATCATTGTTATCAATG 
CGGACAAACTGCGTGTAACCGGTGCCAAATTCGAAGATAAAAAATACTTCCGCCATTCCG 
GTTTCCCAGGCGGTATCTACGAACGCACCTTCCGCGAAATGCAAGAGCAATTCCCGGGCC
GCGCTTTGGAACAAGCTGTAAAAGGTATGCTGCCCAAAGGTCCTCTGGGTTACGCCATGA 
TTAAAAAACTGAAAGTGTATGCGGGTGCGGAACACGCCCATGCTGCGCAACAACCCAAAG 
TTTTGGAACTGAAATAAGGACGCGACATGAACGGTAAATACTACTACGGCACAGGCCGCC 
GCAAAAGTTCAGTGGCTCGTGTATTCCTGATTAAAGGTACAGGTCAAATCATCGTAAACG 
GTCGTCCCGTTGACGAATTCTTCGCACGGGAAACCAGCCGAATGGTTGTTCGCCAACCCT
TGGTTCTGACTGAAAACGCCGAATCTTTCGACATCAAAGTCAATGTTGTTGGCGGCGGCG 
AAACCGGCCAGTCCGGCGCAATCCGCCACGGCATTACCCGTGCCCTGATCGACTTCGATG 
CCGCGTTGAAACCCGCCTTGTCTCAAGCTGGTTTTGTTACCCGCGATGCCCGCGAAGTCG 
AACGTAAAAAACCGGGTCTGCGCAAAGCACGCCGTGCAAAACAATTCTCCAAACGTTAAT 
GTTGGAAATTCAAAAAACCCTGCTTATCGCAGGGTTTTTTATTTGTAGTAGACGGTTTCC
CATTGGCAATCTAAAGATTACAGATTGGGCAAAAATCAAAAACAGTATAGTGGATTAACA 
AAAATCAGGACAAGGCGACGAAGCCGCAGACAGTACAAATAGTATGGCAAGGCGAGGCAA 
CGCCGTACTGGTTTTTGTTAATCCACTATACATGAAAAAATAGAAAAACTGCCGGGTATG 
GTTGAATAAGGGGTCAGACCGGTTCCAGTTCGCTCAGTCCGGGCAAATCTGCAAAACCGC 
GTTCGCGTATGATTTGGCAAAAATTGTTCAGATAGCTCTTGTCCGTATCTTCGGTACGGA
TGGCGGCATACAGTTTGCTTTGCAGTCCGTCGGCAGTAATTTGGCGGTGGACGACATAGC 
CTTTTTCAAGGTAGGGCATGACTGTCCAATAGGGAAGGGCGGCAATGCCACGTCTGCTGG 
CAACCAGTTGGATAATGGCGATGGTCAGCTCGCTGTGTCGGCGCGGCGGGTTGATGTTTT 
TCGGAATCAGGATTTTTTTGGGCAAATCCAGCATCTCGTCGGGAACGGGATAAGTAATCA 
GGGTTTCCCCGATAAAGTCTTCCGCCGTCCAAACGTTTTTGGCGGCAAGCGGATGGTCTG
GTGCGCAAATGCCGACCATTTCGTAGGCAAACAGCGGTTGGAAAGAAATACCGTTTTGTT 
TTTCCGCTTCGGAAACAATGGCAAGGTCGGCACGGTGTTGCAGCAGCAGTCCGACGGGAT 
CCGCTTGGAATCCCGATACGATATCCAATTCGACTTGGGGCCACATCGGGCGGAATTCGC 
CCATGGCGGGCATCAGCCAGTCGAAACAGGTATGGCATTCGACGGCAATCCGCAGCTCTC 
CCGCCTCTCCTTCCGTGATTCGCGCCAAATCCCATTCTGCAACAGCAACTTGAGGTATAA
GTTCGTGGGCGAGGCGCAGCAGCCTTTCGCCCACCGGGGTAAAGCGCAAGGGCGTGGATT 
TGCGTTCGAACAGCGGCGTGCCGTAGTGGTTTTCGAGCATACGGATCTGGTGGGAAAGGG 
CGGATTGGGTAAGGAAAACCCGTTTGGCGGCAAGGGAGACGCTGCCGGTTTCTTCAAGTG 
CCAGCAGGGTTTTGAGGTGGCGCAATTCGATAATGGAATCCATGGGGCTTCAGACGGCAT 
ATTGAACCGGCGCATATTAAAATAATTCATATGCGGGTGCAAATCGGAAAAATGTGAATT
CCGGACATTTTGCGATTAGAATGCCCGCTTGTTTAAAGCGATTAGGAAAAAAGATGGTAT 
TGTGCAGGGATTTTCTGACTTGGTGTAATGAAACATTGCAGACAGCGTTGTTTAAAGATT 
ACGCCCCTAACGGTTTGCAGGTTGAAGGGAGGGAATATATCGGGAAAATCGTTACGTCGG 
TAACGGCAAGCAGGGCAGCGATTGATTTTGCTGTGGAGCAGAAGGCAGATTTGCTTTTGG 
TACATCACGGTATGTTCTGGAAAAACGAGTTGCCGACCGTTACTGGTTGGAAAAAAGAAC
GGATTGCCGCACTGTTACGGCACGACATCAATATGGCAGGCTACCATCTGCCCCTGGATG 
CACATCCCACACTGGGCAACAATGCCCAACTCGCCGACAGATTGGGTTTTGCGACAGAAA 
AACGGTTCGGCGAACAAAACCTGCTCAACTCGGGCAGCCTGAAACAAGCCAAGACACTCG 
GCGCATTGGCGGCGCATATTGAAACAGTTTTGCAACGTAAGCCTGTCGTTATCGGCAATC 
CCGAACGCGAAATCCGACGGGTTGCATGGTGCAGCGGCGGGGCGCAGGGGTTTTTTCAGA
CGGCAATAGACGAAGGTGTCGATCTGTATTTGACGGGGGAAATCTCTGAAGCCCAATACC 
ACCTTGCCAATGAAACGGGTACGGCTTTCATTTCGGCAGGGCATCACGCGACGGAACGTT 
ACGGCGTACGCGCGCTGGCAGAATCGGCGGCAGAGGTTTTCGGGTTGGAAGTGTGCCATT
TTGACGAAAACAATCCGGCTTGAGTCTTGAGAAATATCATAAAACTTTACCTTATTTTAA 
ATAATGCTTTGAGTATCACTGCAAACTGGGTAAAATTGCAACGTTTAATGGCTCGGTATT 
CCCAAAATATTAGGACGACTGAATAATGGATAATCAAGAAATCAACAACGGCCGCCGCCG 
TTTCCTGACACTCGCGACCTGCGGCGCGGGCGGAGTGGCAGCATTGGGTGTGGCAACGCC
GTTTGTGGCCAGTTTTTTCCCTTCGGAAAAAGCCAAGGCCGCCGGTGCTGCCGTCGAGGT 
GGATGTCAGTAAAATCGAAGCGGGTCAGCTGCTGACCGCCGAGTGGCAAGGCAAACCGAT 
TTGGGTGCTCAACCGTACAGATCAGCAGCTTAAAGACCTGAAAGGCCTGAACGGCGAACT 
TACCGATCCCAATTCCGATGCGGAACAGCAGCCGGAGTATGCTAAAAACGAGACCCGTTC
GATTAAGCCGAACATCCTTGTCGCCATCGGTATCTGCACCCATTTGGGCTGCTCGCCCAC
CTTCCGTCCCGACATTGCCCCCGCCGATTTGGGTGCAGACTGGAAAGGCGGCTTCTTCTG 
CCCCTGCCACGGTTCGAAATTCGACTTGGCCGGCCGCGTATATAAAGGTGTTCCTGCCCC 
GACCAACCTGGTTGTCCCGCCATATAAATACTTGAGCGACACAACTATCTTGGTGGGCGA 
AGACTAAGAATAAGGAACGATAATTATGGCAAACCAAACCAATAGCAAAGCAAAAGCATT 
GTTAGGCTGGGTAGATGCCCGTTTTCCATTAAGTAAAATGTGGAAAGAGCATCTGTCTGA
ATACTATGCGCCTAAAAACTTCAACTTCTGGTATTTCTTCGGCTCATTGTCTATGCTGGT 
GCTGGTGATTCAAATCGTCAGCGGTATTTTCCTGACCATGAACTACAAACCGGACGGCAA 
CCTTAACGCCTACCATCTGCCTGCTGCCTTTACCGCAGTAGAGTACATCATGCGCGACGT 
GTCCGGCGGCTGGATTATCCGCTATATGCACTCTACCGGCGCATCTTTCTTCTTCATCGT 
CGTTTATCTGCACATGTTCCGTGGTCTGATTTACGGTTCGTACAAAAAACCGCGCGAATT
GGTGTGGATTTTCGGTTCCCTGATTTTCTTGGCATTGATGGCAGAAGCCTTTATGGGCTA 
CCTGCTGCCTTGGGGTCAAATGTCCTTCTGGGGTGCGCAGGTAATTATTAACCTGTTCTC 
CGCCATCCCTGTTATCGGTCCTGATTTGTCCACTTGGATCCGCGGTGACTTCAACGTTTC 
CGATGTTACTTTGAACCGATTCTTCGCCCTGCACGTTATCGCTGTACCTTTGGTATTGCT 
CGGCTTGGTTGTGGCTCATATCATTGCCTTGCATGAAGTGGGTTCCAACAACCCTGACGG
TGTAGAAATCAAAAAGCTGAAAGATGAAAACGGTGTCCCTCTAGATGGCATACCTTTTTT 
TCCGTATTATGTTGTGCATGATATATTGGCAGTAACGATATTCTTGATTGTCTTCTGTGC 
CGTGATGTTCTTTGCACCTGAAGGCGGCGGCTACTTCTTGGAAGCGCCAAACTTCGATGC 
AGCGAATGCGCTGAAAACACCTCCGCACATTGCGCCGGTATGGTACTTCACTCCGTTCTA 
CGCAATTCTGCGTGCGATTCCTTCCTTTGCCGGTACTCAGGTATGGGGTGTAATCGGTAT
GGGTGCAGCAGTTGTACTGATCGCCTTGCTGCCTTGGTTGGATAAAGGCGAGGTTAAATC 
TGTCCGCTATCGCGGCCCAATCTTCAAAACCGCATTGGTTCTGTTCATCATTGCCTTCAT 
CGGTTTGGGTATTTTGGGTGCAATGGTAGCAACTGATACGCGTACTTTGGTTGCACGTAT 
CCTGTCTTTCGTCTACTTTGCATTCTTCCTGGGTATGCCGTTCTATACCAAACTGGATAC 
CAACAAACCAGTTCCTGAACGCGTAACCATGAGCACTACTAAACAAAAAATTATGTTCTT
TGTTTACGTCGGTATTACCGTTGTTGGTGCTTACTTGTTTGCAACCAATATCTGATGAGG 
GCAGCGAAAATGAAACAAACTCTGAAAAACTGGTTTGCTGCCTTATTGCTGGCAGTGCCT 
ATGAGTGCAGCCGTCGCCAGCGGCGGCGGACACTACGAAAAAGTCGATATCGACCTGCGT 
GACCAAGTCAGCCTGCAGCACGGTGCGCAAATCTTTACAAACTACTGTTTGTCTTGCCAC 
TCTGCAAGCGGTATGCGCTTCAACCGTCTGAAAGACATCGGTTTGACTGACGAAGAAATC
AAGAAAAACCTGATGTTTACCACCGATAATGTCGGCGATGTCATGCATTCGGCGATGAAC 
CCGAAAGATGCGGCAAAATGGTTTGGTGCTGCTCCGCCCGATTTGACGTTGATTGCGCGT 
TCCAAAGGTGCAGACTACCTTTACGCCTATATGCGCGGCTTCTATAAAGATCCGACCCGT 
CCGAGCGGCTGGAACAATACTGTATTCGATAAAGTCGGTATGCCCCACCCGTTGTGGGAG 
CAACAAGGTGTTCAAGCCGTTGAGTTGGATGCCAAAGGTCAGCCGGTTATGGTAAAAGAC
GAACACGGCGAGATGAAGCCTAAGCTGTATTGGGAATCTACCGGTTTGCACAGCCGCCGC 
CTGCCTAACGGCAAAGTGATCCAAAAAGAGTACGACGCATATGTACGCGATTTGGTCAAT 
TACCTTGTGTACATGGGCGAACCTGCACAACTGCAACGCAAACGTATAGGCTATGTCGTG 
ATGATTTTCCTATTTGCGGTTATGCTGCCTTTGGCTTACTTCCTAAATAAAGAATATTGG 
AAAGACGTACACTAAGCGTTTGGAACAAAAGGGCAAATCCTTTAGGGTTTGCCCTTTTTT
CATTTTGCCTGCCGTTTGAAAAGCCTGAATCCGTATGCCGTCTGAACATAGCTGCAACAT 
TTCAGACGGCATCATTCTAAAAATGTCAGACATCGGGAACTTAATCAGGGTTTGTGGCAG 
ATTGATATTGAATTAGGAAAAGGTCTGAAACCGCAAGCAGCATGGTAAGGCGATGTCGAA 
CAGGTTTAAAGCCAATCCGCTATATTTTTTGGAGCTATGTCGAACAAACAGGTAGAATCG 
GAATAAGCGGCGGTTTTGTCCGGTGTGAAAATGTTTGAATATTAGGATAATTGATGGAAC
TGATGACTGTTTTGCTGCCTTTGGCGGCGTTGGTGTCGGGCGTGTTGTTTACATGGTTGC 
TGATGAAGGGCCGGTTTCAGGGCGAGTTTGCCGGTTTGAACGCGCACCTGGCGGAAAAGG 
CGGCAAGATGTGATTTTGTCGAACAGGCACACGGCAAAACCGTGTCGGAATTGGCGGTGT
TGGACGGGAAATACCGGCATTTGCAGGACGAAAATTATGCTTTGGGCAACCGTTTTTCCG 
CAGCCGAAAAGCAGATTGCCCATTTGCAGGAAAAAGAGGCGGAGTCGGCGCGGCTGAAGC 
AGTCGTATATCGAGTTGCAGGAAAAGGCACAGGGTTTGGCGGTTGAAAACGAACGTTTGG 
CAACGCAGCTCGGACAGGAACGGAAGGCGTTTGCCGACCAATATGCCTTGGAACGCCAAA
TCCGCCAAAGAATCGAAACCGATTTGGAAGAAAGCCGCCAAACTGTCCGCGACGTGCAAA 
ACGACCTTTCCGATGTCGGCAACCGTTTTGCCGCAGCCGAAAAACAGATTGCCCATTTGC 
AGGAAAAAGAGGCGGAAGCGGAGCGGTTGAGGCAGTCGCATACCGAGTTGCAGGAAAAGG
CACAGGGTTTGGCGGTTGAAAACGAACGTTTGGCAACGCAAATCGAACAGGAACGCCTTG 
CTTCTGAAGAGAAGCTGTCCTTGCTGGGCGAGGCGCGCAAAAGTTTGAGCGATCAGTTTC
AAAATCTTGCCAACACGATTTTGGAAGAAAAAAGCCGCCGTTTTACCGAGCAGAACCGCG 
AGCAGCTCCATCAGGTTTTGAACCCGCTAAACGAACGCATCCACGGTTTCGGCGAGTTGG 
TCAAGCAAACCTATGATAAAGAATCGCGCGAGCGGCTGACGTTGGAAAACGAATTGAAAC 
GGCTTCAGGGGTTGAACGCGCAGCTGCACAGCGAGGCAAAGGCCCTGACCAACGCGCTGA 
CCGGTACGCAGAATAAGGTTCAGGGCAATTGGGGCGAGATGATTCTGGAAACGGTTTTGG
AAAATTCCGGCCTTCAGAAAGGGCGGGAATATGTGGTTCAGGCGGCATCCGTCCGAAAAG 
AGGAAGACGGCGGCACGCGCCGCCTCCAGCCCGACGTTTTGGTCAACCTGCCCGACAACA 
AGCAGATTGTGATTGATTCCAAGGTCTCGCTGACAGCTTATGTGCGCTACACGCAGGCGG 
CGGATGCGGATACGGCGGCACGCGAACTGGCGGCACACGTTGCCAGCATCCGTGCACACA 
TGAAAGGCTTGTCGCTGAAGGATTACACCGATTTGGAAGGTGTGAACACATTGGATTTCG
TCTTTATGTTTATCCCTGTCGAACCGGCCTACCTGTTGGCGTTGCAGAATGACGCGGGCT 
TGTTCCAAGAGTGTTTCGACAAACGGATTATGCTGGTCGGCCCCAGTACGCTGCTGGCGA 
CTTTGAGGACGGTGGCGAATATTTGGCGCA^iCGAACAGCAAAATCAGAACGCACTGGCGA 
TTGCGGACGAAGGCGGCAAGCTGTACGACAAGTTTGTCGGCTTCGTACAGACGCTCGAAA 
GCGTCGGCAAAGGCATCGATCAGGCGCAAAGCAGTTTTCAGACGGCATTCAAGCAACTTG
CCGAAGGGCGCGCGGGAATCTGGTCGGACGCGCCGAGAAACTGCGTCTGTTGGGCGTGAA 
GGCAGGCAAACAACTTCAACGGGATTTGGTCGAGCGTTCCAATGAAACAACGGCGTTGTC 
GGAATCTTTGGAATACGCGGCAGAAGATGAAGCAGTCTGACTTGTGCGGAAAAATATTGT 
TTCAGCCGGGGCGGGAATGCCGAAAGCGCGGCGTAACTGTACTGGTTCTGATTTTGGGTT 
TTTTGTTTGAAGTACTTACCAAAGCCTTGTCCGGCGAGGTACACCGGCAAGGCGGCGGTT
GTCTGAGCCTTTTGTGTTTCAGACGGCATCAGTGCAGATGATGCCGTCTGAAGACCGTAG 
GGAAGGCGGTTAGAAAAACGGATTGTGCCGCTTTTCGTGTCCGATGGAAGTCATACGCCC 
GTGTCCGGCGACAACTTGCACGGTTTCGGGAAGGGTGAATAATTTGTTGCGGATATTATT 
GATTAAGTCGGCGTGGTTGCCGCGCGGAAAATCGGTTCTGCCTATGGTTTCGTAAAACAG 
CACGTCGCCCGCAATCAGCAATTCCGCCTCGGCACAATAAAAGACGATATGTCCCGGCGT
ATGGCCCGGAATATGCAGCACTTGAAAGGCATAGCGTCCGACCGTGAGCGTTTCGCCTTC 
TTCGAGCCAACGGTTCGGCGCAAAGGCGGGCGAGACGGGAAATCCGTATTGCGCGGTGGT 
TTGCGGCAGCGATTGGAGCAGGAATTCATCGTCCGGATGCGGCCCGAGGACAGGGACTTT 
ATGCGTTTTCAACATTTCGACCACGCCGCCCGCGTGATCGAGATGGCCGTGCGTCAGCCA 
GATTGCCGTGAGCGTAAGTTTGCGGTTTGCCAACGCTTGCAGCAGGAACGGCACGTCGCC
GCCGACATCGGTCAGGACGGCTTCGCCGCTTTCGTCGTCCCAAATCAGGGTGCAGTTTTG 
GCGGAAGGGGGTAACGGGGAAGATTTCGTAACGTAGGGTCATCGGCGTGTTCCTAAACGG 
TTTTTCAGACGGCATCGGGTTTGCCGTTTGTTTTATGGCGGTTTGCCGCCCGTTTTGATA 
TTGGGGGAATCAGATGATTAAGAAGACAATCGGCGGCATCATACCGATTTTTACGGCGGT 
TTTCATCCCTGCATCAGCAGGCGCGGCGGATTTGATGCTGGCGCAGGAATACAAAGGGCA
GGATATTGCCGGCTGGGCGATGAGCGAGAAACTCGACGGCGTGCGCGCCTATTGGGACGG 
AAAGCACCTGATGAGCCGTCAGGGCTACGCGTTTGCTCCGCCCAAAGGTTTTACCGCTCA 
GTTTCCGCCTTATCCTTTGGACGGCGAATTGTATAGCGGACGTGGTCAGTTCGAGCAGAT 
TTCCGCTACCGTGCGTTCTGTTTCTTCAGACTGGCGCGGCATCCGCCTGCACGTTTTCGA 
TGTACCCAAGGCGCAGGGCAACCTCTACCAACGTTTGGCAGTCGCAACGCAGTGGCTGAA
AACGCATCCGAACGCGCCGATTACCATCATCCCGCAAATCAAAGTGCGCGACCGGCAGCA 
CGCGATGGACTTTTTAAAACAAATCGAAGCGCAGGGCGGCGAAGGCGTGATGCTGCGTCA 
GCCCGAATCCCGTTACAGCGGCGGCAGGAGCAGCCAATTATTGAAGCTGAAAAGCCAATA 
CGACGACGAATGCACGGTAACGCGGCACTATGAGGGCAAAGGGCGAAACGCCGGACGGCT 
GGGCGCGGTCGGCTGCAAAAACCGGCACGGCGAATTCCGCATCGGCAGCGGTTTCAAAGA
TAAAGACCGCGACAACCCGCCCAAAATCGGCACACTGATTACCTACCGTTACCGTGGCTT 
TACGCGGAAAGGCACGCCGAAATTTGCCACATTTGTGCGCGTGCGTACCGACCGCTGAGC 
AGGACAGTTCAGACGGCATCCGATACTTTGGTTTATAATTTTCCTTTTACCGACCGATTC
CGACATATGACCGATTTAGAAACCAAACGCCTTGAAACACAGGCGATGCTTGAAAACGCC 
GATCTTTTGTTCGACCAAGGCCAATGCCGTGCCGCACTGCAAAAAGTGGCGGACGAGATT 
ACGCGTGATTTGGGCGGCAAATATCCGCTGCTGCTGCCCGTGATGGGCGGCGCGGTGGTG 
TTTACGGGGCAGTTGCTGCCGCTGTTGCGTTTTCCCTTAGATTTTGATTATGTTCACGTT
TCCCGTTACGGCGACAAGCTGGAGGGCGGCGCGTTCAACTGGAAGCGTATGCCCGATGCG
GAACAAATCCGGGGCAGGCACGTCGTCGTGCTGGACGATATTTTGGACGAAGGGCATACG 
ATGTCCGCCATTCAAGCCAAACTTTTGGAAATGGGTGCGGCAAGCTGCCGTGCGGCGGTG 
TTCGCCAACAAGCTGATTGACAAACCCAAGCCTATCCGAGCCGATTATGTCGGACTGGAT 
GTGCCGAACCGTTATGTTTTCGGTTACGGCATGGATGCGGCGGGCTGCTGGCGCAATCTG
GGCGAGATTTACGCATTGGGCGGAAAATAAGGGCGCGATGCCGTCTGAAGGCTGTTCAGA 
CGGCATCGCGGCCATACGCCGGCAGGATAATGAGGAACAGGACGCATAATATGATAGGGC 
TTTTAATCATCACACACGAAACCATAGGCGAAGCCTACCGCAAGCTGGCGCATCATTTTT 
TTCCGGGCGGACTGCCTGAAAACGTCCGCATACTCGGCGTGCAGCCGACGGAAGACCAAG 
ACGACATCAACAACAACGCCATTGCCGCGCTTCAGGAATTTCCCGACAACGACGGCGTGC
TGATTATGACCGATATTTTCGGTGCGACCCCCTGCAATGCCGCCCGCCGCCTCGTGCGCG 
AAAACAAATCGGCGATTTTGACCGGGCTGAACGCGCCGATGATGGTTAAGGCCGTCCAAT 
ATTCGCCGGCGGCGGAAGACCTTGCCGCCTTTACCGAATGCGTCAGGGAGGCGGCGGTAA 
AAGGCATTTTCGCCATCACGTCCGCGCCCGAAGATTTGGTGTGCCGGCGCAGCGGCGATG 
CCGTCTGAAGAAGCGGCAGGGCAGGAAAACATTTTAAACGGCGTCTGCTGCCGATATACA
TAACACGGGAATCGAAATGCTCAAACAATCCATCGAAATCATCAACAAACTCGGACTCCA 
CGCCCGCGCGTCCAACAAGTTCACCCAAACCGCGTCCCAATTCAAAAGCGAAGTCTGGGT 
TACGAAAAACGACAGCCGCGTCAACGGCAAAAGCATTATGGGGCTGATGATGCTCGCCGC 
CGCCAAGGGTACGGTCATCGAACTGGAGACGGACGGCGCGGACGAGGCGGAAGCGATGCG 
CGCCCTGACCGACTTAATCAACGGCTACTTCGGCGAGGGCGAATAATGAGTATCGTGCTG
CACGGCGTGGCGGCGGGCAAAGGCATTGCCGTCGGTTGCGCCCACCTGATTGCGCGCGGT 
ACGGAGGAAGTGCCGCAGTATGATGTTGCGGAGGCGGACACCGATGCCGAAGCCGAACGT 
TTCGATGCCGCCGTCAAAGCCACGCGCAAAGAGTTGGAACAGCTCCGCAGCGCGATTCCC 
GAAAACGCCCCGACCGAGTTGGGCGCGTTCATCTCGCTACACCTGATGCTCTTGACCGAT 
GTTACCTTGTCGCGCGAACCCGTCGATATTTTAAGGGAACAAAAAATCAACGCCGAGTGG
GCATTGAAGCAGCAGAGCGACAAACTCGCCGCCCAATTCGACAATATGGACGATGCCTAT 
TTGCGCGAACGCAAGCAGGATATGCTGCAAGTCGTCCGCCGCATCCACAACAACCTGATC 
GGGCAGGGCAACGAGTTGGAAGTTGCCGACAACCTGTTTGACGAAACCGTTCTGATTGCA 
AACGACCTTTCGCCCGCCGACACGGTTTTGTTTAAAGAGCAGCGCATTGCCGCCTTCGTT 
ACCGATGCCGGCGGCCCCACCGGGCATACGGCGATTTTGGGCAGGAGCTTGGACATCCCG
TCCGTCGTCGGGCTGCACAACGCGCGCAAACTGATTACCGAAGGCGAAACGGTCATTGTG 
GACGGTATCAACGGCGTGTTGATTATCGCGCCGGATGAGTCGGTGTTGAACGAATACCGC 
CGCCGTGCCCGCGAATACCGCAGCCACAAACGCGATTTGAACAAGCTCAAAAAAACCGCC 
GCCGCCACCGCCGACGGGGTCTGCATCGAGCTTGTGGGCAATATAGAATCCGCCGAAGAC 
GTGAAACCGCTGCACAACCTCGGCGCAGACGGCATCGGGCTGTTCCGCAGCGAGTTTCTT
TACCTGAACCGCGATACGATGCCGTCTGAAGACGAGCAGTACGAAGTGTACAGCGCGATT 
GTCAAAAAAATGAAAGGCAAAAGCGTAACGATACGGACAGTCGATTTAGGTGTGGACAAA 
AACCCGCGCTGGTTCGGGAAAAACAGCACGCCCAACGGCAGCCTCAACCCCGCGCTGGGC 
ATGACCGGCATCCGCCTGTGCCTTGCCGAACCGGTCATGTTCCGCACCCAGATGCGCGCC 
ATCCTCCGTGCGGCGGTACACGGCCCCGTGCGGATGATGTGGCCGATGATTACCTCCGTA
TCCGAAGTGCGCCAGTGCCTCATCCACCTCGACACCGCGCAACGCCAGCTTGCCGAACGC 
GGCGATGCCTTCGGTAAAGTCGGCATCGGCTGTATGATTGAAATTCCGTCTGCCGCGCTG 
ACCGTCGGCAGTATTTTGAAACTGGTCGATTTCATCTCCGTCGGTACCAACGACCTGATT 
CAATACATCTTGTCCGTCGATCGCGGCGACGACAGCGTCAGCCACCTCTACCAGCCCGGC 
CATCCCGCCGTGCTGAAAATGCTGCAACACGTCATCCGTACCGCCAACCGCATGGACAAA
GACGTATCCGTATGCGGCGAGATGGCGGGCGATACCGCGTTTACCCGCGTTTTATTGGGT 
ATGGGGCTGCGCCGTTTTTCCATGAACCCCAACAACATCCTGCCCGTCAAAAACATCATT 
CTGCACAGCAATGTCGGACAGCTCGAAAGTGATATTGTGAAAGTCATCCGCTGCGAAGAC 
GAAGAGAAGTCGGAAAAGCTGATCAAACAGATGAACAGCGTGTCTGTCGAGGAAGAAGCC 
GACTTCAAGGGGCGGTAATAAATACGGCAGGTAAAAAATAGAAATACTTAACAATGCCCG
CAATCTGAAATTTTGCCATTCTTGCAAAATAGAAAACCGAAACAGAAACCCAAAATCGGC 
CATTCCCTCAAAAACAGAAAACCAAAATCAGAAACCTAAAATCCGTCATTCCCGCGCAGG 
CGGGAATCTAGGTTTGTCGGCACGGAAACTTATCGGGAAAAACGGTTTCTTTAGATTTTA
CGTTCTAGATTCCCGCCTGCGCGGGAATGACGATGAAAAGATTGTTGTCGCTTCGGATAA 
ATTTTTGCCGTGTTGGGTTCTAGATTCCCGCTTTCGCGGGAATGACGGCAGAGTGGTTTC 
AGTTGCTCTCGATAAATGCCGCCATCTCAAGTCTCGTCATTCCCTTAAAACAGAAAACCG 
AAATCAGAAACCTAAAATCCCGTCATTCCCGCGCAGGCGGGAATCTAGGTCTGTCGGCAC
AGAAACTTGTCGGGAAAAACGGTTTCTTTAGATTTTACGTTCTAGATTCCCGCCTGCGCG 
GGAATGACGATGAAAAGATTGTTGTCGCTTCGGATAAATTTTTGTCGCGTTGGGTTCTAG 
ATTCCCGCTTTCGCGGGAATGACGGCAGAGTGGTTTCTGTTGCTCCCGATAAATGCCGCC 
ATCTCAAGTCTCGTCATTCCCTTAAAACAGAAAACCGAAATCAGAAACCTAAAATCCCGT 
CATTCCCGCGCAGGCGGGAATCTAGGTTTGTCGGTGCGGAAACTTGTTGAAAACTTTGCA
AAATCCCCTAAATTCCCACCAAGACATTTAGGAGATTTTCCATGAGCACCTTCTTCCAGC 
AAACCGCACAAGCCATGATTGCCAAACACATCAACCGCTTCCCGCTATTGAAGTTGGATC 
AAGTGATTGATTGGCAGCCGATCGAACAATACCTGAACCGTCAAAAAACCCGTTACCTTC 
GAGACCACCGCGGCCGTCCCGCCTATCCGCTGCTGTCCATGTTCAAAGCCGTCCTGCTCG 

GACAATGGCACAGCCTCTCCGATCCCGAACTCGAACACAGCCTCATTACCCGCATCGACT
TCAACCTGTTTTGCCGTTTCGACGAACTGAGCATCCCCGATTACAGCACCTTATGCCGCT 
ACCGCAACTGGCTGGCGCAAGACGACACCCTGTCCGAATTGCTCAAACTGATCAACCGCC 
AACTGACCGAAAAAGGTTTAAAAGTAGAGAAAGCATCCGCCGCCGTCATTGACGCCACCA 
TTATTCAGACCGCCGACGGCAAACAGCGTCAGGCCATAGAAGTCGATGAAGAAGGACAAG 

TCAGCGGCCAAACCACACGGAGTAAGGACAGCGATGCGCGTTGGATCAAGAAAAACGGCC
TCTACAAACTCGGTTACAAACAACATACCCGCACCGATGCGGAAGGCTATATCGAGAAAC 
TGCACATTATAGTAGATTAAATTTAAATCAGGACAAGGCGACGAAGCCGCAGACAGTACA 
GATAGTACGGCAAGGCGAGGCAACGCCGTACTGGTTTAAATTTAATCCACTATATTATGC 
GCAAAGCCTGCCGCAACCGTCCGCTGACGGAGGCGCAAACCAAACGCAACCGATATTTGT 

CGAAGACCCGTTATGTGGTCGAACAGAGCTTCGGTACGCTGCACCGTAAATTCCGCTACG
CCCGGGCAGCCTATTTCGGACTGATTAAAGTGAGTGCGCAAAGCCATCTGAAGGCGATGT 
GTTTGAACCTGTTGAAAGCCGCCAACAGGCTAAGTGCGCCCGCTGCCGCCTAAAAGGCGA 
CCGGATGCCTGATTATCGGGTATCCGGGGAGGATTAAGGGGGTATTTGGGTAGAATTAGG 
AGGTATTTGGGGCGAAAATAGACGAAAACCTGTGTTTGGGTTTCGGCTGTTGTGAGGGAA 

AGGAATTTTGCAAAGGTCTCAGATTGTTGTCGCTTCGGATAAATTTTTGCCGCGTTGGGT
TCTAGATTCCCGCTTTTGCGGGAATGACGGCAGGGTGGTTTCAGTTGCTCCCGATAAATG 
CCGCCATCTCAAGTCTCGTCATTCCCTCAAAAACAGAAAACCAAAATCAGAAACCTAAAA 
TCCCGTCATTCCCGCGCAGGCGGGAATCTAGGTCTGTCGGCACAGAAACTTGTCGGGAAA 
AACGGTTTCTTTAGATTTTACGTTCTGGATTCCCGCCTGCGCGGGAATGACGATGAAAAG 

ATTGTTGTCGCTTCGGATAAATTTTTGCCGTGTTGGGTTCTAGATTCCCGCTTTTGCGGG
AATGACGGCAGGGTGGTTTCAGTTGCTCCCGATAAATGCCGCCATCTCAAGTCTCGTCAT 
TCCCTCAAAAACAGAAAACCAAAATCAGAAACCTAAAATCCGTCATTCCCGCGCAGGCGG 
GAATCTAGGTCTGTCGGCACGGAAACTTATCGGGAAAAACGGTTTCTTTAGATTTTACGT 
TCTAGATTCCCGCCTGCGCGGGAATGACGATGAAAAGATTGTTGTTGTTTCGGATAAAAT 

TTTGCAGCCCTGATAAAAAAATATGGCTGCTTTGGTAAAAAAATGCCGTCTGAAAGGTTT
TCAGACGGCATTTTGTTTTTAAGAAGCATCAGCGGAAGCGGACGATTTCCCGTTCTTCGA 
TATGGATGCGTACCGTATCCTTGCCCGATACCGCCCCGGCGTGCCGCATATCGAGGTTCA 
GCCACAGGATGCCGTGTTCCGGATGGAGGACGGACAGGCTGAACGATTCGGGCAAACAGG 
TACGGGATAATACGCGGCACTCCATGCCGTCTTGGTCGAAACGCACCGCATGTTGCGGAA 

TATGGCGGTTATCGTCGGTATTGGGCAAACCCATCAGTCGGGCGACCTGCACGCAGGATG
GTGTTTTGACCAATGTTTCGGGCGTACCGTATTGTAGAATCCTCCCTTTATGCATCACGG 
CGATTTCGTCTGCCGTCGTACAGGCTTCTTCGGGCGAATGCGTTACCAAAACGGCAGGGA 
TGCCGCCGTTTCGGATACGTTCGGCAGTCATACGGCGCAGCGTGCCGCGCAAATGCGTGT 
CCAAACTGGAAAACGATTCGTCCAACAGCAGCAGGGAAGGGCGGACAACCAAAGCGCGCG 

CCAACGCCAGCCGTTGCTTCTCGCCTCCGGAAAGTTTTTCAGGCTTGCGGTGCGCCTCGT
TTTCCAGTCCGACTTCGGCAAGTGCCGCCATGGCGAGGCGTTCGGCTTCGGCTTTCGGCA 
TTTTTTGCATTTTCAAACCGAATGCCGCATTTTCCAGCGCACTCATATGGGGAAACAGCG 
CGTAATCTTGAAACATCAGCGAGATACGGCGTTTTTCGGGCGGCATACGGGTAATGTTTT 
CTCCGTTCAGCCATATTTCCCCGCCGTCCGGCCGGACAATCCCCGCAATTATATTCAGCA 

GGGTGGATTTTCCGCAGCCCGACCGCCCCAAAACGGCGAGTATTTTGCCGCGCCCGACAG
TCAGGCAGATGTTGTCGGCGACGGTTTTATTGCCGAAGCGTTTGCAGAGTCCGTTCAGTT 
CAAGCATGGCGCATCCTATAAACGTATGCCGTCCAGCCACTCGGACAGCGGATGGGCGGC 
ATCAATCAGACCGTAACGGCAAAACGCGTCCAGCGTAACCAGTTGCGCGTCGTGCATCAT
GTTTCCCGACAACATGGCATCCAACAGACCGCCGATGTCCATTTTCTCAAAACCCGCCAC 
TTCGCCATCCTGATTTTCAGGCAGGAAGGTTTCGGGCAGGACGGCATCGAATACATACAG 
GATTTCATTGTGTACACCCCGGCTGACGGAGCGCAGGCTGTGCAGCTGCGATACCGGGCG 
GATGAGCGGAAGCAGCGTTTTATCCAAACCGGCTTCTTCGCTGCTTTCGCGACACACGGC
TTCAGACGGCATTTCGCCGCCGGAAACACCGCCGGCGGCAGTATTGTCGAGTTTGTTGGG 
ATCGACTGCTTTGTGCGGACTGCGCCTGCCTATCCAGAAATGCCATCGGCCGTCCGATTC 
GGTCAGACCGTTGAGATGGACGGCGCGGCTGAGCAGTCCGAAAGGACGGAAAGCGGCGCG 
TTCGAGCGTGAACAAGGGGTTGCCGCCGCCGTCGGTCAGGTCGAAACACTCGTTGCGCCA 
GCCGTCCAACAGCCCCGCACAGTGCCAACCGAGGGCGAGGTGCTGTAAGCGTCCGCCCAT
ATCAGGCCAGCCGTCCGCATTCAGAAAAATGCCGTCTGAAGACTCCGAGCAGCCTGCCTC 
CCAGTCTTTTTTGACGCGTTCCACCCATTCCGGCGACAGGTTGCCCAAAGGCAGACCGTT 
CAGATACAGCGTTTTCCAGCAACTTTCTGCACCGTAACTTGCTTTTGCCCACTCGAACAG 
AGCATCAAGGTCTTGTTTGCTGACGGATTCGGTAAAACGGACGGTCGGCATGGTTGGCTT 
TCTTGAAGACAATGCGGAAGATTGTAGCCAAACTGCCTGAATTTGCAACCGCAAACAGAC
GGCGGTCTGCCGTTCCGATATTGCGGCCGGAAGGAAAATCCGGCAAAAAACGGAGGCGGC 
AGGAAACAGGAAACAGGAAACAGGCAAAACNNNNCCCCTCGGCATTGAACCGAGGGGTTT
GATATTTGGTTGCGGGAGCAGGATTCGAACCTACGACCTTCGGGTTATGAGCCCGACGAG 
CTACCATGCTGCTCCATCCCGCGTCAGAAGATGAAACTATACGGCAGATTTTTTATGTTG 
TCAAACATTACTTCCGCGAAAAATCATAAAATTTTGCCGTCCGGCGGTTTATGCCTGATT
TGAAATATTATTTTCTTTACAAAAGTTCATGTTCGTGATTTAATTTTGGTTAACATTGAA 
ACAGGGGTGCTGCCTGATGTTTAGGCGGCTGAGAAATACCCTTTACACCCGATCGGGATA 
ATACCTGCGTGGGGAGTTTTCACGGATTCTGCTTTTCAGACGGCATTGGTTTTCAAATGC 
CGTCTGAAAACGCAAAACGCTCCTGTTTCTTTAATTCTAAACGAGAAAACAGGAGCATTT 
TTTTATGACTACGCCAAAAAAAACCGCCAAAACTTCCGGCAACGAAGCGCGCGAGCTTGC
CGACTTGAGCGAAGACATCGGCATCTGCTTTAAATATCCGAACTCGGAACGCGTGTATCT 
GCAAGGCAGCCGCGACGACATCCGCGTGCCTTTGCGCGAAATCCGTCAGGACGACACCTA 
CACGGCGCAAGGTACGGAAGCCAATCCGCCGATTCCCGTCTACGACACCAGCGGCGTGTA 
CGGCGACCCGGCGGCGCATATCGACCTGAAACAAGGTCTGCCGCACATCCGCACCGCGTG 
GCTGGACGAACGCGGCGATACCGAAATCCTGCCCAAGCTCTCCAGCGAATACGGCATCGA
ACGCGCACACGATCCGAAAACCGCCCATCTGCGTTTCAACCAAATCACCCGCCCGCGCCG 
CGCGAAAAGCGGCAGCAACGTAACCCAGCTTCACTACGCGCGCCAAGGCATTATCACGCC 
CGAAATGGAGTTTGTCGCCATACGCGAACGTTTAAAATTAGACGAATTGTCCCAAAAACC 
GGAATACGCCAAACTCTTGGAACAGCACGCGGGGCAAAGTTTCGGTGCGAACATCCCGAC 
CCATCCCGACCAAATCACGCCCGAATTCGTGCGCCAAGAAATCGCCGCCGGACGCGCGAT
TATTCCCGCCAACATCAACCACCCCGAACTCGAACCGATGATTATCGGCCGCAACTTTCG 
TGTCAAAATCAACGGCAACTTGGGCAACTCCGCCGTCACCTCCAGCCTGACCGAAGAAGT 
CGAAAAAATGGTGTGGTCGCTGCGTTGGGGCGCGGACACGATTATGGATTTGTCCACCGG 
CGCGCACATCCATGAAACGCGCGAATGGATTATCCGCAACGCGCCCGTCCCCATCGGCAC 
CGTGCCGATTTACCAAGCGTTGGAAAAAACCGGCGGCATCGCCGAAGATTTGACTTGGGA
TTTGTTCCGCGACACCCTCATCGAACAGGCGGAGCAAGGCGTGGACTATTTCACCATACA 
CGCGGGCGTGTTGCTGCGTTATGTGCCGATGACCGCCAACCGCCTCACCGGCATCGTCTC 
GCGCGGCGGTTCGATTATGGCGAAATGGTGTTTGGCACATCATCGGGAAAATTTCCTCTA 
CACGCATTTCGACGAAATCTGCGAAATTATGAAAGCGTATGACGTATCGTTCAGCCTCGG 
CGACGGCCTGCGCCCCGGCTGCATTGCCGATGCCAACGACGAATCCCAATTCGCCGAACT
GCACACCTTGGGCGAATTGACCGATAAAGCGTGGAAACATGACGTACAAGTCATGATCGA 
AGGCCCCGGCCATGTGCCGCTGCAACGCGTCAAAGAAAACATGACCGAAGAGCTGCAACA 
CTGCTTTGAAGCACCTTTTTACACGCTCGGCCCGCTCGTTACCGACATCGCACCCGGCTA 
CGACCACATCACCTCGGGCATAGGCGCGGCCAATATCGGCTGGTACGGCACGGCGATGCT 
TTGTTACGTTACCCCGAAAGAGCATTTGGGGCTGCCCGACAAAGAAGACGTGCGCACCGG
CATCATCACCTACAAACTCGCCGCCCACGCCGCCGATCTCGCCAAAGGCTGGCCGGGCGC 
ACAATTACGTGACAACGCCCTGAGCAAAGCGCGTTTCGAGTTCCGCTGGCGCGACCAATT 
TCGCTTAAGCCTCGACCCTGAACGTGCCGAGAGCTTCCACGACGAAACCCTGCCTGCCGA 
AGGCGCGAAAATCGCCCACTTCTGCTCGATGTGCGGCCCCAAATTCTGCTCGATGAAAAT 
CACGCAGGAAGTGCGCGACTACGCCGACAAGCAAAAAGCCCAACGGCAGGGCATGGAGGA
AAAAGCGGTCGAGTTCGTCAAAAAAGGCGCGAAGATTTACAGTTAAACGTCAAGCAAAAA 
ATGCCGTCTGAAAACCGGAAAAAAGGCTTCAGACGGCATTCTTTCGCTTGTGAAAATATA 
GTGGATTAACAAAAATCAGGACAAGGCGACGAAGCCGCAGACAGTACAGATAGTACGGAA
CCGATTCACTTGGTGCTTCAGCACCTTAGAGAATCGTTCTCTTTGAGCTAAGGCGAGGCA 
ACGCCGTACTGGTTTTTGTTAATCCACTATAATTTTTAAAATTTTTATATTCACATAGAT 
GGGGATGATGGGATTTAGGATTCTGATTTTGTTTTTGAGAGAATGAAGGAATTTGAGATT 
GTGGGCGATTATCGGGAAAAATAGAATCTTTCCGCCGTCATTCCCGCGCAGGCGGGAATC
TAGACATTCAATGCTAAGGCAATTTATCGGGAATGACTGAAACTCAAAAAACTAGATTCC 
CACTTTCGTGGGAATGACGGGATTAAAGTTTCAAAATTTATTCTAAATAGCTGAAACTCA 
ACGCACTGGATTCCCGCCTGCGCGGGAATGACGAAGTGGAAGTTACCCGAAACTTAAAAC 
AAGCGAAACCGAACGAACTAGATTCCCACTTTCGTGGGAATGACGGCAGAGCGGATTCTG 

TTGCTCCCGATAAATGCCGCAACCTCAAATCCCGTCATTCCCGCGCAGGCGGGAATCTAG
GTCTGTCGGTGCGGAAACTTATCGGGTAAAACGGTTTCTTGAGATTTTGCGTCCTGGATT 
CCCACTTTCGTGGGAATGACGGGATTAAAGTTTCAAAATTTATTCTAAATAGCTGAAACT 
CAACGCACTAGATTCCCGCCTGCGCGGGAATGACGGCATATTTTGACATTGAATAAAAAA 
GACTAAAAACAGGAAAAAGCCAAACAGAAAAAAGCCAAACAGAAAAAAGCCAAACAGAAA 

ACAGCGAAACAGAAAACAGCGAAACAGAAAACAGCGAAACAGAAAACAGCGAAACAGAAA
ACAGCGAAACAGAAAACAGCGAAACAGAAAACAGGAAAAAGCCAAACAGAAAACAGGAAA 
AAGCCAAACAGAAAAAAGCCTGTCTGGCGACAGGCTTTTTGTTGATACCAATCTTTGCAG 
ATTAGAATTTGTGGCGCAGACCGACACCGCCGGCAGTCGCTACGAATTTGTTTTCGCCTT 
TGCCTTCTTGCAACCAACCGGCAGAAACCAAGGCAGAAGTGCGTTTGGAGAAGTCGTATT 

CCGCACCGACAACCACTTGGTCGTATTCGTTGCCTATGTCTGCATCATCAACCAAACCTT
TGAAGCCGTGGGCGTAAGAAACTCGGGGCGTTACGTTGCCGAAGCGGTATGCCAAGGTAG 
CGGCAACTTCGGTTTGAGAGTTGTGCGAATTGGAAGCATCAGTCAGTTTCGCGTCTTGTT 
GCTGTACGGCTACGGAAGCGTACAGGGCATCATTGTCGTAACCGCTGACCAAACGGTGAA 
TCTGGTATTTCTCAATATTCAAGCCCTCTTGCACTTGATGATGTCTTTTATAGGCACCGC 

CATATTGCACGAAGAAGCCACCGTTTTTGTAGTTGAAGCCGGCGTGGTAAGATTCGCTGT
TATGTCTGCCTGCATTGTCGTTAAGCGCGTATTGTACGCTGCCGCTGAGGCCGGCAAATT 
CGGGAGAATCGTAGCGTACGGAAATGAGGCGTGCCTCGGGTTCGGCAATTTTGTTTACAC 
CCAAATAGTCGCTTTTGCTATCCCAAGGATTGATGTCGCCGGTGTCTTTCAGGACGCTGT 
TCAAACGACCGACGCGCAATTTACCGAAGCCGCCTTTCAAGCCGATGAAGGATTGGCGGT 

TGCCCCAACCGGAGTCAGTACCGGCGATAGATGCTTTTTGCTCAACCTGCCAAATGGCTT
TCAGGCCGTTACCGAGGTCTTCTTGGCCTTTGAAGCCGATTTTCGAACCCAAATCAACGA 
TGCCGGTAGCGGTTGTAACTTCAGTAACTTGGCCGTTCTGGTGAAATACAGAGCGGGAAG 
TTTCTACGCCGGCTTTGATGGTGCCGTACAGGGTAACGTCAGCCATTGCTGCAACAGGAA 
GGGCTGCCAAAGTCAGGGCAATCAGGGATTTTTTCATTGCTGTATTCCTTTTTTGGTTAA 

GAAATTTAAGCCGGCCGGGCTTTCCAAGCCGCTTAGCTTTGCATTTACCGCCGACGTTTT
CGGCGATTCCATGTCTGAAACTATAGATGTTCCTGTCTGAAAAAACAAACTTCTGCCTGT 
TTTTGTGTGGAATTCGATGTGTATTTTGAAGGGCGGATAGAGATATTATGGGTATTTTTT 
GTTTTATAACATATGGTTATTTAAATTTTTTAAGATTTGCATTTTTACAACACTTACTCG 
GGAGGGTATTGGAGGGCATTGCAAACCGGGGGTTATAAAGACGGCAAAAAACGCGCACGT 

GGTTTCTTTCGGAACGGGCGGATGCAAAACCCGCCCCTGCGGCAGGAGATAGTGGATTAA
CAAAAATCAGGACAAGGCGACGAAGCCGCAGACAGTACAAATAGTACGGAACCGATTCAC 
TTGGTGCTTCAGCACCTTAGAGAATCGTTCTCTTTGAGCTAAGGCGAGGCAACGCTAACC 
GTCATTCCCGCCACTTTTCGTCATTCCCGCTCAGGCGGGAATCTAGAATCTCGGACTTTC 
AGATAATCTTTGAATATTGCCGCTGCCTTAAGGTCTGGATTCCCGCCTGCGCGGGAATGA 

CGGCTGCAGATGCCCGACGGTCTTTAGAGTGGATTAACAAAAATCAGGACAAGGCGACGA
AGCCGCAGACAGTACAAATAGTACGGAACCGATTCACTTGGTGCTTCAACACCTTAGAGA 
ATCGTTCTCTTTGAGCTAAGGCGAGGCAACGCTAACCGTCATTCCCGCCACTTTTCGTCA 
TTCCCGCTCAGGCGGGAATCTAGAATCTCGGACTTTCAGATAATCTTTGAATATTGCTGT 
TGTTCTAAGGTCTAGATTCCCGCCTGCGCGGGAATGACGAATCCATCCGCACGGAAACCT 

ATATCCCGTCATTCCTACGAACCTACATCCCGTCATTCCCTCAAAAACAGAAAACCAAAA
TTAGAAACCTAAAATCCCGTCATTCCCGCGCAGGCGGGAATCCAAACTTGTCCGCACGGA 
AACTTATCGGATAAAACGGTTTCTTAGATTCCACGTTCTAGATTCCCGCCTGCGCGGGAA 
TGACGAATCCATCCGCACGGAAACCTATATCCCGTCATTCTTACGTNCCTACATCCCGTC
ATTCCCTCAAAAACAGAAAACCAAAATTAGAAACCTAAAATCCCGTCATTCCCGCGCAGG 

CGGGAATCTAGGTCTGTCGGTGCGGAAACTTATCGGGTAAAACGGTTTCTTTAGATTCCA
CGTTCTAGATTCCCGCCTGCGCGGGAATGACGGCTGCAGATGCCCGACGGTCTTTATAGT 
GGATTAACAAAAATCAGGATAAGGCGACGAAGCCGCAGACAGTACAAATAGTACGGAACC 
GATTCACTTGGTGCTTCAACACCTTAGAGAATCGTTCTCTTTGAGCTAAGGCGAGGCAAC
ACCGTACTGGTTTAAAGTTAATCCACTATAATGAACACAATCCATTCAGACTATTCAATC 
AGGCAAACATCTCCTGCAATACTGCAAACAGTTTTTCAGCCGTACTGTTGTCTAAATTGC 
CAAGATGTTTGACCAATCCGGCTTTATCCACAGCCCTAATCTGTTCGGGCAAAAGCAAAC 
CGTCTTTATCCTGAAAGCGGACATTGACGCGGAACGGGGCAGGACGGCTTCCGCTCGTCA
TGGGNACGATCAGCACAGTCTTGAGATAGTTGTGTATTTCAGGAGGAGAGACTACGACAC
AAGGACGTGTCTTTTTGATTTCGCTTCCTACGGTCGGGTCTAAGGAGACCAGATAGATTC 
CGCCGCGTACTACCATATCCATTCTTTATCCGCTTCGTTTTCAATTTCGGAAAAAAAATG 
CTCCTGCTCGGTTTCGACAAGCATTGCGGCAGCTTCTGCCCATCCCCTGCGAACGGTAGG 
ACAGCTTAAAATAATATTGCCCTTTTCAACTGTAACAGCCAAGCTGTCTACTGCCCCTAT
TTGACCCAATAATGATTTGGGCAGAATCACGCCTTGCGAGTTTCCCATTTTGCGTATGTT 
GAGAATCATATACGTACCATATTCACCTGTTTATGTAATAACAATGTTAGTACCTTGATG 
AGGTAGTGTCAACATGGAAAAAGATTGCCTGACAGTTTGTCCGATTTCAAAATCTCCGCG 
ACAAGCATGTTTTAAAGCCATTCGGGGATTTGGGGGCGGATGATGCCGTATGCCTCGGGA 

TAGTCGACGCCGGTCAGGTAAAGTCCGTCGGGCATGAAGGTCGGCGGGGCTTTGAGGCGG
CTGCGTTCTTGAATCAGTGCGGCGTNGCCTTCGACGCTGAGTCTGCCGCTGCCGACATAA
ACGAGCGCGCCCATGATGTTGCGTACCATGTGGTGCAAANAGGCGTTGCCGTGCAAATCG
AGGCGGACGAGTCCTGAGCTTTGGGTAAGGTTCGGCGCGGTAGATGGTTTTGACGGGGGA 
TTTTGCTTGGCATTCGGCGGCGCGGAAGCTGGAGAAGTCTTGTTCGCCGACCAATAAGGC 

GGCAGCCTGCCGCATCTGCCCGATGTCGAGTTTGAGGTGTGTCCAGCCTGCCCTGTTTTT
GAGCAGGGGGGAACGGACGGGGGCGGATTCGAGCAGGTAGCGGTAGTGCCGTCCGTATGC 
GTCAAATCGTGCATGAAATTCGGGGGCGACCTGTCGGGCGTGCAAAACGGCAATGCCTTC 
GGGCAGGTGGGCATTTACGCCGCGCACCCATGCCTGTTGGGGACGGGCGGCAGTTGTGTC 
GAAGTGGACGACTTGGGCGGTGGCATGCACGCCGGTGTCGGTCCTGCCGGCAACGGTGGT 

GGAAACCGCTTCCCCTGCTATTTGGGCGAGCGCGGTTTCCAATGCCGCCTGAACGGTCGG
TACGCCGTCAGCCTGTTTCTGCCAGCCGTAAAAGCGGCTGCCGTCATAGGATAGGGTTAT 
TGCCCAGCGTTGTTTTTGTGCGGTATCCATCGGATTTGGGATTCGGATAAATGTTCAGAC 
TGCATTGTATCGCAGATTTTGCAGGGAAACGGCAAACGCCCAGGGCGAGCGGCGTTGTTT 
GGGGAGTTGTTGGGGGGGGGGGATGCAGTTGCTACGAATCGCTATCCTGTGAATTTACCC 

TGTCAGGAGTGCCCGAATCGTCATTCCCGCGCAGGCGGGAATCTAGGACGTAAAATCTAA
AGAAACCATTTTATAGTGGATTAACAAAAACCAGTACAGCGTTGCCTCGCTTTAGCTCAA 
AGAGAACGATTCTCTCAGGTGCTGAAGCACCAAGTGAATCGGTTCCGTACTATTTGTACT 
GTCTGCGGCTTCGTCGCCTTGTCCTGATTTTTGTTAATCCACTATATCCGATAAGTTTCC 
GCACCGACAAAACTAGATTCCCACTTTCGTGGGAATGACGGGATGCAGGTTCGTGGGAAT 

GACGCGAACAGAAACCTCAAATCCCGTCATTCCCGCGCAGGCGGGAATCTAGACCTTAGA
ACAACAGCAATATTCAAAGATTATCTGAAAGTCTGGGATTCTGGATTCCCACTTTCGTGG 
GAATGACGGAATGTAGGTTCGTAGGAATGACGGGATGCAGGTTTCCGTATGGATGGATTC 
GTCATTCCCGCGCAGGCGGGAATCTAGGTCTGTCAGTGCGGAAACTTATCGGATAAAACG 
GTTTCTGGAGATTTTTCGTCCTGGATTCCCACTTTCGTGGGAATGACGCGGTGCAGGTTT 

CCGTATGGATGGATTCGTCATTCCCGCGCAGTCGGGAATCTAGACATTCAATGCTAAGGC
AATTTATCGGAAATGACTGAAACTCAAATNNCTGGATTCCCACTTTCGTGGGAATGACGG
GATGCAGGTTTTCTTAACCCCGCGTTCTAGATTCCCACTTTCGTAGGAATGACGGCGGTA 
GATTTGGCAGATGCGGCGGATTCGGCAGGTCTCAACCCATCCTACAATCCACCCTGACCG 
CCTTTTACGAATCCGCCGCCATCCTGGGAAACGCAAAAAAAATGCCGTCCGAAAACCTTT 

CAGACGGCATTTTCGCGGGCAAATCAGTAAAAGACTTCGCGTCAGCTTAAGCGTTTCATA
CCGCACGTCGGACCGGGCGGGTTTCGGGTTTCGGAAGAGCGGTGTGAAATGAAACGCGGC 
AGACGGCGTGTGTGCCGCCATCCTGCCCCGAACCGGCGGCAAAGCGTCATCGCGGTTTGA 
ATCGGGCGGGCGCGTGCCTGTTCCGGCACGGATGCCGTCTGAAAGCGGCGGGTCGGTCCG 
GTCCGCGTCAGCCTAAGGCGAAAAGTTCCCTGCCGTGGATTTTGAGCCACTGTTTCGCTT 

TGGGCGTGTAGGGGGCGAAGCGGCGGGTCAGTTCCCAAAAGGCGGGGCTGTGGTCGGGAT
GGGCGAGGTGGCAGAGTTCGTGTATGCAGACATAGTCGGCAACGTATTCCGGTGCGCCGA 
CCAGCCGCCAGTTGAAGCGTATGCCTGTGGTTTTGCGGCACACGCCCCAGAAGGTTTTGG 
CAGAGGTCAGCGAGGAGGAGGCGGGGAACAGTTGTGTGGTGCGGGCGTGGCGTTCGAGGC 
GGGGNATCAGGTAACTGTGCGCCTGCCGTTCCAAAAAGTCCCGCAGCAGCGCAAGCTGTT

TTTCGGGTGCGCCTTCGGGAACACGGATTTCAGACGGCATCAGCAGGATTTGCGTGTCTT
GATGGGCGGTGAGGGCAAGCTGTCTGCCGTGGAAGAGGATGGATTCGGGCAGCCGGTTTT 
CGGCAGTTTGCGGCGGCGGTGTTTTCGCCAGTGTTTGCCGCAGGACGGCTTCGTTTTCAT 
ACAGCCAGCGGTTTAGAGCGGAGACGGAGAAGCAGGGTGGGACGCTGATGCGGACGGTAT
GTGTGCCGGCGGGGCGGATAATCAGGTTTTTCTTGGCACGGCGCTTGATTTCGACGGTCA 
GTTCCATGCCGTCTGAAAGGGTGTGGACAAAGGCGGTCATGCGGTTTCAGACGGCATTTT 
GGCGGCAAACGGGCCTGCGCCGGAAATAAGCGGTTGTTGCGTTTCGATGAGATGTTCGCA 
TTTTTCCATCAATTCGGCTTCGCTGCCGCTTGCGTGCGGGATGGTCGGACAGATGACGAC
GGTGATTTCCCCCGGATATTTCAGAAAGGAGTTTTTCGGCCAAAATTCGCCGCTGTTGAG 
GGCGACGGGGACGATGTCCATCTCAAACATTTTCGCCATGCGCGCGCCGCCGAGTTTGTA 
TTTGCCGCGTTTTCCGGGCGCAAGGCGCGTGCCTTCGGGGAAAATGGTAATCCAATAGCC 
TTCGTTTTTGCGCACCAACCCCTGTTTTATGAGCTGCTCGTTGGCTTCGCGGCGGTTGTT 
GCGGTCTATGCCTATGGTTTTGACCAGTTTCAAGCCCCAGCCGAAAAAGGGGATTTTGAA
CAACTCGCGTTTGGCAACGTAAACCTGCGGCGGAAAAATGTCCTGAAGGGCGAGCGTTTC 
CCAGCCGCTTTGGTGTTTGGCGCAGATGACGGCGGGGCGGTCGGGGATGTTTTCCGCGCC 
GATGATGCGGTATTTGAGCCCGACGATGTGTTTGAGCGACCAGTTGAGAATGCCGACCCA 
GACCCGCGCCATCTTGTGCGCCCCGTCCCGGAAAGGCGAGGCGAGCAGCATAAAGGGAAA 
GAGGAAAATCAGGGTGGAACAGAGTATCAGCCAGTAAATCAGGTTGCGGATGATGAGCAT
GGTTTTGCCTTGTCGGAATGCGGTATGTTCAGTCGGCTTGCGGTGCGGTGTTTTCCTGCA 
TGATGTATTGTGAGAAATCGAGCAGGGTATCGAAAACCTGTGTGTGTTCGGGCAATTCGT 
GTCCGTGTTGGGAGAGCGTTTTTTTGCCTTTTCCGGTCAGAACCAGCGCGGGTTTGCCGC 
CGACGGCATCGATTGCCTGCAAATCGCGCAGGCTGTCGCCGACCAGCCAGGTTTCCGAGG 
CTTGGGCGTTGAAGCGTCCGATGATGTCTTCAATCATACCCGGTTTGGGCTTGCGGCAGT
TGCAGTTGTCGGCATCGGTGTGCGGGCAGAACCAGATGCCGTTGATTTCGCCGCCTGCCT 
GACGGACGAGGCGGTGCATTTTGGCGTGCATTTCGGTAAGGTTTTGAACGGTAAAATATT 
TGCGCCCGATGCCGGATTGGTTGGTGGCAACGGCGACGGTGTAGCCTGCCTGCGTCAGAA 
ATGCCACCGCATCCATGCTGCCTTCGACAGGTATCCACTCGTCAACGGATTTGACGAAGT 
CGTCGCGGTCCTGATTGATGACGCCGTCGCGGTCGAGAATGATGAGTTTCATCGCGTTTC
CTTTGGATTGGGGCAGGTCGGGGGTGGCATTATACTGAAATATCGGTGGAAATGCGCCTG 
TGCCGCGCGATAACGCGCCTGTTCCGGCAACCGCTTGCAATGCGCGGGTACGGCGTTTCG 
GGGGCGGTTGCCGATGTTTTCCTGCCTGCTCCTGCAACGGCGGATTTTCTGCCGGACGTA 
TTTTCCGCTGCAAAAGTCTCGGCGCGGAAGTTGCGGGGCAATGCCGTTTGGGGACGCGGT 
CAGGCGGCAAGCATTTCAAGAATCAGCCCGATGTGTTTGTCTAAAGCCGGGTTGTAGTTC
AGTGCCCGCATCGGGTCGGACAGGCGGTGTCCGCCCTGTTTTTTAAGGCTGACGGCTTGG 
GCAACGGCTTGCGCCAGGGTTTCCGGGTTTTGGCGTGAGAAAAAGAAGCCGGCTTCTTCG 
TTCATGACCTCTGTACATGCCATGTTTTCGGAGAGGACGACGCGTGTGCCGCATAGGACG 
GATTCGACGCCGACCAGCCCGAAGGGTTCGTACAGGGAAGCCATAATGGTAAAGTCGGCG 
GCGCGGTAGAGTTCGGGCATATCGGTGCAGAAGCCCAGTCCGACGACGTTTTTCATAGGG
CGGGGAAGCGGGGAGCCGACNACGGCGAGCTTCACGGGCAGGCTGGTATGTTCGAAAAAG
TCGGCAAGCAGTTCCAGACCTTTGCGCGTGTGGCCGGTCGATGGGAACAGGAAAACGGTT 
TCATGGTCGGCAAAGCCGTATTTGGCGCGCAGGTCGGCAGTTTCTCCGGGTTGTGGAAAG 
AAGCGTTCCGTATCTGCGGGGGGGGGGGGCGACTTGGATTCTTTCAGGGGGAACGCCGTA 
CAGTCCGACCAGTTCGCGCCGCATCATATGGGAATGCGCCATAATCAGTTTGGCGGTGGC
GTAGTTGCTGCGGTTGCGGCGTATGGCGAGGCGGTCGAGCAGGTTCGGTTTTTGCGCCAT 
ATGGTGCAGGTAGCCCAAGTGTGTGCCGCCGCAGATGAGGAGGTCGGCGTAATCGGCGTG 
GTGGCAGGCAATCAGTTTGGCGGCACTGTTTTTTCTGGTTTGAGCGAGCCGGCTTGAAAA 
GAGGAATGAGCGTAGTTTTTTCAGCGTCCGGTGTTGATCGACAAGATGGGGTTCGATTAG 
GGCGTATTCAGGAATGCTGTGATCAAATTTCGTCGCATAAACGGCCGGTGTGATGTTTTG
TCTGTTCAGACCCTTTACCAAATCCAATGTGTAGCGTTCCGTGCCGCCGCCGTGTTTGAA 
GTTGTTGGTTGCAATGTCTATTTTGAGCTTCATCATTGTTCCTTTATGGTTGCGTCCCGG 
TTTGTCGGGGCGGGATTTGTGCGTGAGGGGCAGGGTAATGCGCTGTGTGCCGGAATACGG 
TTGCCGTTTGTTGCGGCAATGCCGTCTGAAGCCGCCGGCGGGCTTCAGACGGCATTTTGC 
CTTTATCCTTTAAATACGGGGACGAGTTCCATTTGGCTCAATACCTGTGCGGCGATGTTG
ACGATGCCGAAAAGGAAGACCCAAACCATCAGCCACAAGCCGCCGTAAACTTTATAGGTT 
TTGCCTGCGCCGAATTTTTTGCGCGAACGGTAGAGCAGCATGGCGGGGATGATGCCTGTC 
CAGACGGTTGCCGCCAGGCCGACGTAGCCGATGGCGGTAACGAAGCCGGTGGGGAAGAGC 
AGGCAGGAAATCAGGGGCGGCAGGAAGGTCAGCGCGGCGGTTTTGGTGCGGCCGGAGATG 
CTGTCGTTCCATTTGAAGATGTCGGCGATGTAGTCGAAGAGTCCGAGCGTTACGCCTAAA
AACGAGGTGGCGATCGCCATATAGGAAAACAGGGACAATATTTTGTCCATATTGCCGGTT 
TGGGCGAATTTGGACAGGGTTTCGATGAGGACGGAGACTTGCCCTTCGGCGGCGATGACG 
GGGGCGAACTCGTTGCGCGGCAGGTTGCCTTGGATGGCGGTTTGCCAGAGGACGTAAATT
ACCAGCGCAATCAGTGTGCCCGTCCAGATGGATTTAGCCACTTTGGGCGCGTCGCCTTTA 
AAGTATTTGAGCAGGCTGGAGACGTTGCCGTGGAAGCCGAAGGAAGCGAGGCAGACGGGC 
AGGGCGGTGGCGGCGTAAATCCAGTAGTTTGTGCCGGCGGGGGCTTGGGTATCGAAGAGG 
ACGGACGGCTTGGCATCGGCAATCAGCCCGCCGGCCGCCCAAATAAAGGTCAATACCATG
CCGCCGATAAGGACGCCGGTGAAGCGGTCGACCAAGCGTGCGGATGCCCATACGCAAAAG 
GCGAGGATGCCGAAGAAGACGAGTTGTCCGACGGTGAGTGAAACGTCGCCGCCTGCCGCG 
CTGCCTAAGCCTTTGGCGGTCAGGTCGCCGCCGACGAAGATATAAGCGTAAGTAAGCAGG 
TATAAAACGAAGGCGACGGCGATGCCGTTGATGATGTTCCAGCCGCGTCCGAGCAGGTCT 
TTGACCATCGTGTCGAAACTTGCGCCGTGCGGATAATGGGTGTTGACTTCCAAAATCATC
AGGCCGCTGGAAAGCATAGAAAACCAGGTGTACAGCAACACGGCCAGCGAGCCGGTAAAC 
CATACGCCGGATGTGGCGGTCGGGTTGGCGAGCATGCCTGCGCCGATGACCGTGCCGGCG 
ATAATCATCGCGCCGCCGAACAGTGAAGGGGTTTTGTTGGGCATATTTTGTCTTTCTGCC 
AGAAAAAGCGAGCCGCCATTATGCCGTAAAGTGTAAGGATTTGTAAGGTATTTGCGCCGC 
GCCGCCCGAAAAGGCTTTCAGACGGCATTGTGTTCCATAGTATAATCTTGGGTTTTGGAG
TGGGCGGTTCGTCAGATGGGAGGGAAAATGTCCGACAAAAAATATAATGTCGATGAGGGG 
GAAATCGCCAAATTCAGCCGGATTGCCGACAAATGGTGGGACAAGTCGGGCGAGTTCAAA 
ACCTTGCACGACATCAATCCGCTGCGGCTGGATTATATCGACGGACACGCGGATTTGTGC 
GGCAANCGGGTTTTGGACGTGGGCTGCGGCGGCGGCATCTTGGCGGAAAGTATGGCGCGG

CGCGGCGCGGCGTTTGTAAAGGGCATCGACATGGCGGAGCAGTCGTTGGAAACCGCCCGC
CTGCACGCGGCTTTGAACAATGTCGCCGATATCGAATACGAATGTATCCGCGTGGAAGAC 
CTTGCCGAGGCGGAACCGCACTCGTTCGATGTGGTAACGTGCATGGAAATGATGGAACAC 
GTCCCCGATCCCGCCGCCATCGTGCGTGCTTGTGCCAATCTGGTCAAACCGGACGGCATG 
GTGTTTTTTTCCACCATCAATAAAAACCCGAAATCGTACCTGCATCTGATTGTGGCGGCG 

GAATATCTGTTGAAGTTTGTCCCCAAAGGCACGCACGACTGGAAAAAATTCATCGCACCT
GCCGAGCTGGCGCGAATGTGCCGTCAGGCAGGCTTGGATGTGGCGGATACGAAGGGTATG 
ACTTACCATGTGTTGTCGCAAACTTATGCCCTGTGCGATTCGACCGATGTGAATTATATG 
TTTGCCTGCCGTCCGGCGTTCTGACGGCGGGTTTGCCCGTTTTTGAGCAAGTGAGTTGAT 
ATGTCTGTCTATACCAGTGTTTCCGATGATGAAATGCGCGGCTTCCTGAGCGGTTACGAT 

TTGGGGGAATTTGTTTCCCTGCAGGGCATCGCGCAGGGGATTACCAACAGCAATTATTTT
CTGACGACGACTTCGGGACGTTATGTGCTGACCGTGTTTGAAGTGTTGAAACAGGAAGAG 
CTGCCGTTTTTTCTGGAGCTTAACCGGCATTTGAGTATGAAGGGCGTGGCGGTTGCCGCG 
CCGGTTGCGCGCAAAGACGGCCGGCTTGATTCCGTTTTGGCGGGTAAGCCTGCCTGCCTG 
GTTGCCTGCCTGAAAGGTTCGGATACCGCGCTGCCGACGGCTGAGCAGTGTTTTCATACC 

GGTGCGATGTTGGCGAAAATGCACCTTGCCGCCGCCGATTTCCCTTTGGAAATGGAAAAC
CCGCGTTACAATGCGTGGTGGACGGAGGCGTGCGCCCGGCTGCTGCCCGTCCTGTCGCAA 
GACGATGCCGCACTGCTGTGTTCCGAAATCGATGCGTTGAAGGACAATCTCGGCAATCAT 
CTGCCTTCGGGCATCATCCATGCCGATCTGTTTAAAGACAATGTGTTGCTTGACGGCGGT 
CAGGTATCGGGCTTCATCGATTTCTATTACGCCTGCCGGGGCAATTTTATGTATGACTTG 

GCGATTGCGGTCAACGATTGGGCAAGGACGGCGGACAATAAGTTGGATGAGGCGTTGAAA
AAGGCGTTTATCGGCGGTTATGAGGGCGTGCGCCCCTTGAGTGCCGAAGAAAAGGCGTAT 
TTCCCGACCGCCCAACGTGCCGGCTGCATCCGTTTTTGGGTGTCGCGCCTGTTGGATTTT 
CATTTTCCGCAGGCGGGCGAGATGACGTTTATCAAAGACCCGAACGCGTTCCGCAACCTG 
CTGTTGAGTTTGGGTTGAGTGCGTCCGGCGTTTGACAGAAATGCCGTCTGAAAGGGTTTC 

AGACGGCATTTTTATGGCTGATTAAAACGAAAATGAGACGATAGCCGGGTATTTTCCATT
TTATAGTGGATTAACAAAAATCAGGACAAGGCGACGAAGCCGCAGACAGTACAGATAGTA 
CGGAACCGATTCACTTGGTGCTTCAGCACCTTAGAGAATCGTTCTCTTTGAGCTAAGGCG 
AGGCAACGCTGTACTGGTTTTTGTTAATCCACTATAAATCAATGTAATCCATTCTGTTCC 
CGATGTTTATGCCGTCTGAAACCCATCCTGTGTCGGGCTTCAGACGGCATATTGCTTCAA 

AGCAGGTTTTCCGAGGCAACCCAGTTCAGAATATCGGCTTCGACCGCTTCGGGCGTGCCG
GGGTTGGCGATTTTGACCCCGTATTCGCCTTCGCCCAGTTCCTCGAGGATTTCCAGTTGC 
GAATCGAGCATCCCTGCTTTCATGTAATGTCCTTTGCGCGACATCATGCGCTCGAGGTTG 
ATGTCTTGCGGCGGACTGAGGTGGATGAAGGCAGCTTTGCCTTCGGCTCCGCGCAGAATG 
TCGCGGTAGCCGCGTTTGAGGGCGGAACAGGTTACGATGGTGTGGTTCGCACCGTTTTGC 

GCCTGTTGCGTCATCCAGTCGCGCAGATTGCCCAACCACGGATAGCGGTCTTCATCGGTC
AGCGGAATACCCGCGCCCATCTTGTCGCGGTTGGCTTGGGTGTGGAACTCGTCGCCTTCG 
GCATAGGGACATTGACCGAGGTGTTTCTGCAGGGACAGCGCGGCGGTGGTCTTGCCGCAG 



